Conditional inclusion: integral constant expression is unlimited? - c++

Per C++11 (and newer) this code is valid:
#if 1.0 > 2.0 ? 1 : 0
#endif
However, most (if not all) C++ compilers reject it:
$ echo "#if 1.0 > 2.0 ? 1 : 0" | g++ -xc++ - -std=c++11 -pedantic -c
<stdin>:1:5: error: floating constant in preprocessor expression
<stdin>:1:11: error: floating constant in preprocessor expression
N4849 has this (emphasis added):
The expression that controls conditional inclusion shall be an integral constant expression except that identifiers (including those lexically identical to keywords) are interpreted as described below and it may contain zero or more defined-macro-expressions and/or has-include-expressions and/or has-attribute-expressions as unary operator expressions.
and this (emphasis added):
An integral constant expression is an expression of integral or unscoped enumeration type, implicitly converted to a prvalue, where the converted expression is a core constant expression.
The expression 1.0 > 2.0 ? 1 : 0 is an integral constant expression.
So where does the C++ standard prohibit using a floating-point literal (for example) in the expression that controls conditional inclusion?

Answer from Richard Smith:
This is an error in the standard wording. See http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#1436 for details and a proposed fix -- though that fix is known to be wrong too (it permits lambda-expressions).
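To make the gap concrete, here is a minimal sketch contrasting the two contexts. It assumes a compiler that, like the g++ shown in the question, rejects floating constants in #if. The names kLanguageChoice, kPreprocessorChoice, THRESHOLD_A_X10 and THRESHOLD_B_X10 are hypothetical, and scaling the values by 10 is only one possible way to restate the comparison in integers.

```cpp
// In the language proper, the expression from the question is an
// integral constant expression and can be used as one.
constexpr int kLanguageChoice = 1.0 > 2.0 ? 1 : 0;  // hypothetical name
static_assert(kLanguageChoice == 0, "1.0 > 2.0 is false");

// Preprocessors in practice accept only integer arithmetic, so the same
// comparison is restated with the values scaled by 10 (an assumption
// made for this sketch).
#define THRESHOLD_A_X10 10  // stands for 1.0
#define THRESHOLD_B_X10 20  // stands for 2.0

#if THRESHOLD_A_X10 > THRESHOLD_B_X10 ? 1 : 0
constexpr int kPreprocessorChoice = 1;
#else
constexpr int kPreprocessorChoice = 0;
#endif

static_assert(kPreprocessorChoice == kLanguageChoice,
              "both forms select the same branch");
```

Note that a constexpr variable cannot be fed back into #if: after macro replacement, any remaining identifier other than true and false is replaced with 0.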

Related

What kinds of expressions are allowed in a `#if` (the conditional inclusion preprocessor directives) [duplicate]

Many sources online (for example, https://en.cppreference.com/w/cpp/preprocessor/conditional#Condition_evaluation) say that the expression need only be an integer constant expression.
The following are all integral constant expressions without any identifiers in them:
#include <compare>
#if (1 <=> 2) > 0
#error 1 > 2
#endif
#if (([]{}()), 0)
#error 0
#endif
#if 1.2 < 0.0
#error 1.2 < 0.0
#endif
#if ""[0]
#error null terminator is true
#endif
#if *""
#error null terminator is true
#endif
Yet they fail to compile with clang or gcc, so there obviously are some limitations.
The grammar for the #if directive is given in [cpp.pre] in the standard as:
if-group:
# if constant-expression new-line group_opt
All of the previous expressions fit the grammar of constant-expression.
It goes on later to say (in [cpp.cond]):
1/
The expression that controls conditional inclusion shall be an integral constant expression except that identifiers (including those lexically identical to keywords) are interpreted as described below
8/
Each preprocessing token that remains (in the list of preprocessing tokens that will become the controlling expression) after all macro replacements have occurred shall be in the lexical form of a token.
All of the preprocessing tokens seem to be in the form of [lex.token]:
token:
identifier
keyword
literal
operator-or-punctuator
<=>, >, [, ], {, }, (, ), * are all an operator-or-punctuator
1, 2, 0, 1.2, 0.0, "" are all literals
So what part of the standard rules out these forms of expressions? And what subset of integral constant expressions are allowed?
I think that all of these examples are intended to be ill-formed, although as you demonstrate the current standard wording doesn't have that effect.
This seems to be tracked as active CWG issue 1436. The proposed resolution would disqualify string literals, floating point literals and also <=> from #if conditions. (Although <=> was added to the language after the issue description was written.) I suppose it is also meant to disallow lambdas, but that may not be covered by the proposed wording.
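As a hedged illustration, the string-literal conditions from the question can be checked with static_assert, where they are ordinary constant expressions, while #if is kept to integer arithmetic. The name kIntegerConditionHeld is hypothetical, and the sketch assumes a C++11 compiler that accepts subscripting a string literal in a constant expression, as GCC and Clang do.

```cpp
// Rejected inside #if by GCC and Clang, but fine in the language proper:
static_assert(""[0] == '\0', "the empty string holds only its terminator");
static_assert(*"" == '\0', "dereferencing gives the same terminator");

// What #if accepts in practice: integer literals and operators on them.
#if (1 << 4) == 16 && (7 % 4) == 3
constexpr bool kIntegerConditionHeld = true;   // hypothetical name
#else
constexpr bool kIntegerConditionHeld = false;
#endif

static_assert(kIntegerConditionHeld, "integer-only condition was evaluated");
```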

Is there a constant expression that is not a core constant expression?

According to cppref, a constant expression is not bound to be a core constant expression.
My question:
Is there a constant expression that is not a core constant expression?
[expr.const]/5 defines "constant expression" as:
A constant expression is either a glvalue core constant expression
that refers to an entity that is a permitted result of a constant
expression (as defined below), or a prvalue core constant expression
whose value satisfies the following constraints: [...]
There is no such thing as a constant expression that isn't a core constant expression, and cppreference doesn't claim otherwise.
[expr.const] lists a whole series of things that are not core constant expressions. These include signed integer overflow (65536 * 32768 where int is 32 bits wide), division by zero, and certain shift operations.
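A small sketch of both halves of that definition, with the rejected cases left as comments so the snippet still compiles. The names divide, kQuotient, g_counter and kCounterRef are hypothetical.

```cpp
// Hypothetical helper: whether a call is a core constant expression
// depends on the arguments, not only on the function.
constexpr int divide(int a, int b) { return a / b; }

// A prvalue core constant expression: usable as a constant expression.
constexpr int kQuotient = divide(10, 2);

// Not core constant expressions, so compilers reject them here:
// constexpr int kBad = divide(10, 0);       // division by zero
// constexpr int kOverflow = 65536 * 32768;  // overflow where int is 32 bits

// A glvalue core constant expression: it refers to an object with static
// storage duration, which is a permitted result of a constant expression.
static int g_counter = 0;
constexpr int& kCounterRef = g_counter;

static_assert(kQuotient == 5, "10 / 2 is evaluated at compile time");
```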

Why are literals considered expressions in C++?

I'm studying the C++ programming language using Programming: Principles and Practice Using C++.
I'm in chapter 4 now, where the book introduces the concept of an expression, but I can't understand it at all:
The most basic building block in a program is an expression. An expression computes a value from a number of operands. The simplest expression in C++ is simply a literal value such as 11, 'c', "hello". Names of variables are also expressions. A variable represents the object of which it is the name.
Why is a literal considered an expression? Why is the name of a variable considered an expression?
Expressions (in programming languages, in math, in linguistics) are defined compositionally (or inductively). So expressions are often made of subexpressions: x*2+y*4 is made of two sub-expressions, x*2 and y*4, joined by the addition operator +.
But you need a base case (the most atomic and simple expressions). These are literals (2) and variables (x). If either of them were not an expression, 2*x could not be an expression (since both operands of the binary multiplication * are sub-expressions).
Notice that in C and C++, assignments and function calls are expressions too.
Think of it like this: An expression is a sequence of steps that produce a value. Thus, 4+3 is a two-step expression, because you (1) start with the number 4, and (2) add 3 to it.
Therefore, 7 can be regarded as a single-step sequence, because there is only one "action" performed: (1) start with the number 7.
Thus, both a = 4+3; and a = 7; can be generalised to a = <expression>;.
An expression is "a sequence of operators and operands that specifies a computation" (http://en.cppreference.com/w/cpp/language/expressions).
Let's look at a simple expression: 3 + 3. When you evaluate this expression, you get the result 6.
Now let's look at another expression: 3. When you evaluate this expression, you get the result 3.
A literal is considered an expression because a literal is a type of constant and constants are expressions with a fixed value.
A variable is also considered as an expression because it can be used as an operand within another expression or as an expression by itself.
In software design, the composite pattern can be used to represent expressions.
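The inductive definition can be sketched as a tiny expression tree: literals are the leaves (the base case) and operators are nodes built from sub-expressions. This is in the spirit of the composite pattern, but it uses a single tagged struct rather than a class hierarchy so that it can be evaluated at compile time in C++11. All names (Expr, eval, x_value and so on) are hypothetical, and variables are modelled as leaves with fixed values (x = 3, y = 5).

```cpp
// One node type covers both the base case and the composite case.
struct Expr {
    enum Kind { Literal, Add, Mul };
    Kind kind;
    int value;        // used only when kind == Literal
    const Expr* lhs;  // used only by Add and Mul
    const Expr* rhs;  // used only by Add and Mul
};

// Evaluation mirrors the inductive definition: a leaf yields its value,
// a composite node first evaluates its two sub-expressions.
constexpr int eval(const Expr& e) {
    return e.kind == Expr::Literal ? e.value
         : e.kind == Expr::Add     ? eval(*e.lhs) + eval(*e.rhs)
                                   : eval(*e.lhs) * eval(*e.rhs);
}

// Leaves: the simplest expressions.
constexpr Expr x_value{Expr::Literal, 3, nullptr, nullptr};  // stands for x
constexpr Expr y_value{Expr::Literal, 5, nullptr, nullptr};  // stands for y
constexpr Expr two{Expr::Literal, 2, nullptr, nullptr};
constexpr Expr four{Expr::Literal, 4, nullptr, nullptr};

// Composite nodes: x*2, y*4, and x*2 + y*4.
constexpr Expr x_times_2{Expr::Mul, 0, &x_value, &two};
constexpr Expr y_times_4{Expr::Mul, 0, &y_value, &four};
constexpr Expr whole{Expr::Add, 0, &x_times_2, &y_times_4};

static_assert(eval(two) == 2, "a lone literal is already an expression");
static_assert(eval(whole) == 26, "3*2 + 5*4");
```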

Is it valid to use boolean literals in preprocessor conditionals?

Consider the following code, which results in the boolean literal true being evaluated in a preprocessor conditional:
#define SOME_MACRO true
int main ()
{
#if SOME_MACRO
return 1;
#else
return 0;
#endif
}
Clang 3.4 and GCC 4.8 both accept this code, even with -pedantic -std=c++11 -Wall -Wextra.
Visual Studio 2013 rejects it, with fatal error C1017: invalid integer constant expression.
My reading of n3376 §16.1 is that the regular C++ rules for evaluating constant expressions should apply.
If so, this code is valid, and it's a bug if MSVC does not accept it.
But I don't find the standardese particularly clear. Could someone confirm this?
Yes, it is valid. See C++11 §16.1/4 (emphasis mine)
Prior to evaluation, macro invocations in the list of preprocessing tokens that will become the controlling
constant expression are replaced (except for those macro names modified by the defined unary operator),
just as in normal text. If the token defined is generated as a result of this replacement process or use
of the defined unary operator does not match one of the two specified forms prior to macro replacement,
the behavior is undefined. After all replacements due to macro expansion and the defined unary operator
have been performed, all remaining identifiers and keywords, except for true and false, are replaced
with the pp-number 0, and then each preprocessing token is converted into a token. The resulting tokens
comprise the controlling constant expression which is evaluated according to the rules of 5.19 using arithmetic
that has at least the ranges specified in 18.3. For the purposes of this token conversion and evaluation all
signed and unsigned integer types act as if they have the same representation as, respectively, intmax_t
or uintmax_t (18.4). This includes interpreting character literals, which may involve converting escape
sequences into execution character set members. Whether the numeric value for these character literals
matches the value obtained when an identical character literal occurs in an expression (other than within a
#if or #elif directive) is implementation-defined. Also, whether a single-character character literal may
have a negative value is implementation-defined. Each subexpression with type bool is subjected to integral
promotion before processing continues.
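The rules in that paragraph can be observed with a short sketch. It assumes a conforming preprocessor such as those in GCC or Clang (the question reports that Visual Studio 2013 rejects the first form). SOME_MACRO comes from the question; NOT_DEFINED_ANYWHERE and the k-prefixed constants are hypothetical names.

```cpp
#define SOME_MACRO true

// true survives macro replacement and is evaluated as a boolean literal.
#if SOME_MACRO
constexpr int kTrueBranch = 1;
#else
constexpr int kTrueBranch = 0;
#endif

// Any other remaining identifier is replaced with the pp-number 0.
#if NOT_DEFINED_ANYWHERE
constexpr int kUnknownIdentifier = 1;
#else
constexpr int kUnknownIdentifier = 0;
#endif

// Subexpressions of type bool are promoted before evaluation continues,
// so true + true is the integer 2.
#if true + true == 2
constexpr int kPromoted = 1;
#else
constexpr int kPromoted = 0;
#endif

static_assert(kTrueBranch == 1, "true selects the #if branch");
static_assert(kUnknownIdentifier == 0, "unknown identifiers become 0");
static_assert(kPromoted == 1, "bool operands are promoted to integers");
```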

Do parentheses force order of evaluation and make an undefined expression defined?

I was just going through my textbook when I came across this question:
What would be the value of a after the following expression? Assume the initial value of a = 5. Mention the steps.
a+=(a++)+(++a)
At first I thought this was undefined behaviour because a is modified more than once. Then I reread the question, and since it says "Mention the steps", I figured the question might be valid after all.
Does applying parentheses make an undefined behaviour defined?
Is a sequence point created after evaluating a parentheses expression?
If it is defined, how do the parentheses matter, since ++ and () have the same precedence?
No, applying parentheses doesn't make it a defined behaviour. It's still undefined. The C99 standard §6.5 ¶2 says
Between the previous and next sequence point an object shall have its
stored value modified at most once by the evaluation of an expression.
Furthermore, the prior value shall be read only to determine the value
to be stored.
Putting a sub-expression in parentheses controls how operands are grouped with operators, but it does not create a sequence point. Therefore, it does not guarantee when the side effects of the sub-expressions, if they produce any, will take place. Quoting the C99 standard again, §5.1.2.3 ¶2:
Evaluation of an expression may produce side effects. At certain
specified points in the execution sequence called sequence points, all
side effects of previous evaluations shall be complete and no side
effects of subsequent evaluations shall have taken place.
For the sake of completeness, the following are the sequence points laid down by the C99 standard in Annex C.
The call to a function, after the arguments have been evaluated.
The end of the first operand of the following operators: logical AND &&; logical OR ||; conditional ?; comma ,.
The end of a full declarator.
The end of a full expression; the expression in an expression statement; the controlling expression of a selection statement (if
or switch); the controlling expression of a while or do
statement; each of the expressions of a for statement; the
expression in a return statement.
Immediately before a library function returns.
After the actions associated with each formatted input/output function conversion specifier.
Immediately before and immediately after each call to a comparison function, and also between any call to a comparison function and any
movement of the objects passed as arguments to that call.
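If the textbook's "steps" are wanted, each modification has to go into its own statement, because the end of a full expression is a sequence point. The sketch below spells out two plausible orderings and shows that they disagree, which is one way to see why the single-expression form cannot be given an answer. The function names are hypothetical, and the multi-statement constexpr functions assume C++14 or later.

```cpp
// Ordering 1: evaluate the right-hand side, then read a for the +=.
constexpr int rhs_then_read(int a) {
    int post = a++;        // post = 5, a = 6
    int pre = ++a;         // a = 7, pre = 7
    a = a + (post + pre);  // 7 + 12
    return a;
}

// Ordering 2: read a for the += first, then evaluate the right-hand side.
constexpr int read_then_rhs(int a) {
    int old = a;             // old = 5
    int post = a++;          // post = 5, a = 6
    int pre = ++a;           // a = 7, pre = 7
    a = old + (post + pre);  // 5 + 12
    return a;
}

static_assert(rhs_then_read(5) == 19, "first ordering");
static_assert(read_then_rhs(5) == 17, "second ordering");
```

Both functions are well defined because every statement modifies a at most once. Neither is what a += (a++) + (++a) "means": the original expression has undefined behaviour, so a compiler is not required to produce either result.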
Adding parentheses does not create a sequence point, and in the more modern standards it does not create a sequenced-before relationship with respect to side effects, which is the problem with the expression that you have. Unless noted otherwise, the rest of this answer is with respect to C++11. A parenthesized expression is a primary expression, covered in section 5.1 Primary expressions, which has the following grammar (emphasis mine going forward):
primary-expression:
literal
this
( expression )
[...]
and in paragraph 6 it says:
A parenthesized expression is a primary expression whose type and value are identical to those of the enclosed expression. The presence of parentheses does not affect whether the expression is an lvalue. The parenthesized expression can be used in exactly the same contexts as those where the enclosed expression can be used, and with the same meaning, except as otherwise indicated.
The postfix ++ is problematic since we cannot determine when the side effect of updating a will happen. Before C++11, and in C, this applies to both the postfix ++ and the prefix ++ operations. For how undefined behavior changed for prefix ++ in C++11, see Assignment operator sequencing in C11 expressions.
The += operation is problematic since:
[...]E1 op= E2 is equivalent to E1 = E1 op E2 except that E1 is
evaluated only once[...]
So in C++11 the following went from undefined to defined:
a = ++a + 1 ;
but this remains undefined:
a = a++ + 1 ;
and both of the above are undefined pre C++11 and in both C99 and C11.
The draft C++11 standard, section 1.9 Program execution, paragraph 15 says:
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced. [ Note: In an expression that is evaluated more than once during the execution of a program, unsequenced and indeterminately sequenced evaluations of its subexpressions need not be performed consistently in different evaluations. —end note ] The value computations of the operands of an operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.