In C++11 when a preprocessing directive of the form...
#if expr
...is encountered, expr is evaluated as a constant-expression as described in 16.1 [cpp.cond].
This is done after macro replacement on expr: its remaining identifiers (and keywords) are replaced by 0, its preprocessing-tokens are converted to tokens, the defined operator is evaluated, and so on.
My question is what happens when one of the tokens in expr is a user-defined-literal?
User-defined literals are like function calls, but function calls can't occur in expr (I think) as a side effect of the identifier replacement. However, technically, user-defined-literals could survive.
I suspect it is an error, but I can't quite see how to conclude that from the standard.
Perhaps the (pedantic) impact of adding user-defined literals on clause 16 [cpp] was simply ignored?
Or am I missing something?
Update:
To clarify by an example:
What does this preprocess to:
#if 123_foo + 5.5 > 100
bar
#else
baz
#endif
Does it produce bar or baz, or is it an error?
GCC 4.7 reports:
test.cpp:1:5: error: user-defined literal in preprocessor expression
so it thinks it is an error. Can this be justified with reference to the standard? Or is this just "implicit"?
In C++11 when a preprocessing directive of the form... #if expr ...is encountered, expr is evaluated as a constant-expression as described in 16.1 [cpp.cond]. This is done after macro replacement on expr: its remaining identifiers (and keywords) are replaced by 0, its preprocessing-tokens are converted to tokens, the defined operator is evaluated, and so on.
My question is what happens when one of the tokens in expr is a user-defined-literal?
The program is ill-formed.
The core of my argument is gleaned from the observation in 16.1/1 footnote 147 that in translation phase 4 there are no identifiers other than macro names yet.
Argument:
According to 2.14.8 [lex.ext]/2
A user-defined-literal is treated as a call to a literal operator
or literal operator template (13.5.8).
So here we have a remaining call to an (operator) function even after all the substitutions described in 16.1/4. (Other attempts, for example to use a constexpr function, would be thwarted by the substitution of all non-macro identifiers by 0.)
As this occurs in translation phase 4, there are no defined or even declared functions yet; an attempted lookup of the literal-operator-id must fail (see footnote 147 in 16.1/1 for a similar argument).
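As an illustration of that substitution, a constexpr function can't be smuggled into a conditional either, because its identifier is wiped out before evaluation. A sketch (names made up; this is rejected during preprocessing):

constexpr int twice(int n) { return 2 * n; }  // exists only from phase 7 onward

#if twice(2) > 3   // phase 4: `twice` is not a macro, so it is replaced by 0,
#endif             // leaving `0(2) > 3`, which is not a valid controlling expression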
From a slightly different angle, looking at 5.19/2 we find:
A conditional-expression is a core constant expression unless it
involves one of the following as a potentially evaluated subexpression
(3.2) [...]:
[...]
an invocation of a function other than a constexpr constructor for a literal class or a constexpr function;
an invocation of an undefined constexpr function or an undefined constexpr constructor [...];
From this, use of a user-defined literal in a constant expression requires a defined and constexpr literal operator, which again can't be available in translation phase 4.
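For contrast, here is a hedged sketch of the same literal in phase 7, where a constexpr literal operator is available (the suffix is borrowed from the question; the definition is made up):

constexpr double operator"" _foo(unsigned long long v) { return v * 1.0; }

static_assert(123_foo + 5.5 > 100, "fine here: the literal operator is declared, defined, and constexpr");

// #if 123_foo + 5.5 > 100   // still ill-formed in phase 4: no declarations exist yet
// #endif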
GCC is right to reject this.
In C++11 when a preprocessing directive of the form #ifdef expr is encountered, expr is evaluated as a constant-expression as described in 16.1. This is done after macro replacement on expr: its identifiers (and keywords) are replaced by 0, its preprocessing-tokens are converted to tokens, the defined operator is evaluated, and so on.
No!
The argument to #ifdef, #ifndef, or defined is not evaluated. For example, suppose I never #define the preprocessor symbol SYMBOL_THAT_IS_NEVER_DEFINED. This is perfectly valid:
#ifdef SYMBOL_THAT_IS_NEVER_DEFINED
code
#endif
Using a symbol in #if when it isn't defined is a different matter. It is not actually ill-formed: assuming SYMBOL_THAT_IS_NEVER_DEFINED hasn't been defined, the remaining identifier is replaced by 0, so the following evaluates to false. It is usually a bug, though, and many compilers warn about it (e.g. with GCC's -Wundef):
#if SYMBOL_THAT_IS_NEVER_DEFINED
code
#endif
Analogous to checking whether a pointer is non-null before dereferencing it, checking whether a symbol is defined before using its value is the robust pattern:
#if (defined SYMBOL_THAT_MIGHT_BE_DEFINED) && SYMBOL_THAT_MIGHT_BE_DEFINED
code
#endif
Related
#include <iostream>
#define Abc likely
# if __has_cpp_attribute(Abc)
#define Pn 0
#endif
#if __has_cpp_attribute(likely)
#ifndef Pn
#define Pn 1
#endif
#endif
int main() {
    std::cout << Pn;
}
For this example, GCC prints 0 while Clang prints 1. According to [cpp.cond] p5
Each has-attribute-expression is replaced by a non-zero pp-number matching the form of an integer-literal if the implementation supports an attribute with the name specified by interpreting the pp-tokens, after macro expansion, as an attribute-token, and by 0 otherwise. The program is ill-formed if the pp-tokens do not match the form of an attribute-token.
So the directive # if __has_cpp_attribute(Abc) should behave the same as #if __has_cpp_attribute(likely); GCC has the right behavior. Again, consider this example:
#include <iostream>
#define Head <iostream>
# if __has_include(Head)
#define Pn 0
#endif
#ifndef Pn
#define Pn 1
#endif
int main() {
    std::cout << Pn;
}
In this example, both compilers print 0. However, according to [cpp.cond] p4
The header or source file identified by the parenthesized preprocessing token sequence in each contained has-include-expression is searched for as if that preprocessing token sequence were the pp-tokens in a #include directive, except that no further macro expansion is performed. If such a directive would not satisfy the syntactic requirements of a #include directive, the program is ill-formed. The has-include-expression evaluates to 1 if the search for the source file succeeds, and to 0 if the search fails.
Note the wording "no further macro expansion is performed", which means Head won't be replaced by <iostream>, and there is no source file named Head. Hence, Pn should be 1 instead. Could this be considered a bug in GCC and Clang?
I am not sure that the answer below is correct. I will leave it up for reference for now.
I think the second example does not fit the has-include-expression grammar. If you look at [cpp.cond] there are two forms mentioned, which are further subdivided into multiple cases, referring also to [lex.header].
Collecting the possible forms and combining them here for presentation, we get:
__has_include(<...>)
__has_include("...")
__has_include(string-literal)
with ... as a placeholder and string-literal being any string literal. Your form __has_include(Head) is none of these, since Head neither starts with ", nor with <, nor is it a string literal.
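For reference, forms that clearly do match the grammar look like this (file names made up):

#if __has_include(<vector>)       // header-name form: < h-char-sequence >
#define HAVE_VECTOR 1
#endif

#if __has_include("config.h")     // quoted form: " q-char-sequence "
#define HAVE_CONFIG 1
#endif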
[cpp.cond]/3 does mention that if the first of the two syntax choices for has-include-expression does not match, the second is considered and the preprocessor tokens are processed like normal text, presumably meaning they are macro-expanded. However it is not clear to me whether this is supposed to reference all preprocessor tokens between ( and ) before the above-mentioned grammar rules are applied or just the h-pp-tokens in the __has_include(<h-pp-tokens>) form. In the former case, the compilers would be correct in returning 0.
However, the latter case makes more sense to me, especially when comparing e.g. to the grammar rule for #include, which uses similar forms, but instead of #include <h-pp-tokens> the last form is #include pp-tokens. [cpp.include]
[cpp.cond]/7 says that the identifier __has_include shall not appear in any context not mentioned in the subclause. I would think that shall not here means otherwise ill-formed, in which case the program should not compile without diagnostic. If it means otherwise undefined behavior, then all compilers are correct.
For the first example, I think you are right. Clang has a recently-fixed bug report regarding the macro expansion here, and if you choose Clang trunk on Compiler Explorer, the result already coincides with GCC's.
By mistake, I wrote something along the lines of constexpr bool{};, and while GCC and Clang rejected this, MSVC was more than happy to compile it (see Godbolt). From my understanding, functions (and thus constructors) evaluated at compile time cannot have side effects, therefore this can never have any effect, but is it indeed ill-formed?
(In my experience, MSVC tends to be wrong, but in this specific case I didn’t find where the standard forbids this.)
That's just not valid syntax. It is "forbidden" by the standard by virtue of not being a possible grammar production.
A declaration such as
constexpr bool b{};
is a simple-declaration and has the syntax decl-specifier-seq init-declarator-list(opt) ; (see C++17 [dcl.dcl]/1). The keyword constexpr is a decl-specifier, and so is bool (although only some decl-specifiers have an effect on the type; bool does, but constexpr does not).
The rest of the declaration, b{}, is an init-declarator, which consists of a declarator plus an optional initializer, which in this case is {}. (See [dcl.decl]/1.) The declarator is b. In general, a declarator must contain an identifier such as b. See [dcl.decl]/4.
There is a similar grammar production called an abstract-declarator which lacks an identifier (See [dcl.name]/1). Abstract declarators are allowed in particular contexts, such as when writing down a type-id, or in a parameter-declaration-clause (function parameters are allowed to be unnamed). However, an init-declarator must contain a declarator, not an abstract-declarator.
There is no other grammar production that would match constexpr bool{}; either.
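A small sketch of the grammar contexts mentioned above (names made up):

void f(bool) {}        // unnamed parameter: an abstract-declarator is allowed here
constexpr bool b{};    // init-declarator: the declarator is the identifier b
// constexpr bool{};   // no identifier: matches no grammar production (rejected)

static_assert(sizeof(bool) >= 1, "type-id contexts also use abstract-declarators");

int main() { f(b); }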
I'm going through the BSON source code, and came across something I've never seen before.
Line 22 in bson-macros.h:
#if !defined(BSON_INSIDE) && !defined(BSON_COMPILATION)
#error "Only <bson.h> can be included directly."
#endif
What is the defined(XXXX) macro above? I can guess what it does, but I can't seem to find any documentation about it. Is it specific to some compilers? It gives me a W4 warning on Microsoft Visual C++ (that I'm trying to resolve in my project).
From 6.10.1
The expression that controls conditional inclusion shall be an integer
constant expression except that: identifiers (including those
lexically identical to keywords) are interpreted as described
below;166) and it may contain unary operator expressions of the form
defined identifier
or
defined ( identifier )
which evaluate to 1 if the identifier is currently defined as a macro
name (that is, if it is predefined or if it has been the subject of a
#define preprocessing directive without an intervening #undef directive with the same subject identifier), 0 if it is not.
It is not a macro; it is an operator.
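A short sketch of the operator's two spellings (macro names made up):

#define INSIDE_LIB

#if defined INSIDE_LIB          // unparenthesized form: evaluates to 1
#endif
#if defined(INSIDE_LIB)         // parenthesized form: also evaluates to 1
#endif
#if !defined(NEVER_DEFINED)     // negated, as in the BSON guard: evaluates to 1
#endif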
Is the standard C assert(e) macro permitted to evaluate e multiple times? What about C++11 or later? I don't see any guarantees in the Open Group spec, and the answer isn't apparent to me from some searching (1, 2).
Context: could func() be called multiple times in assert(func() != NULL)?
Yes, I already know this is a bad idea for other reasons: as the glibc manual points out, the argument of assert() won't be evaluated at all if NDEBUG is defined. However, assuming NDEBUG is not defined, is there any guarantee on the maximum number of times e is evaluated?
Question prompted by this one.
The C standard says
In the C11 standard (ISO/IEC 9899:2011), §7.1.4 Use of library functions says:
Each of the following statements applies unless explicitly stated otherwise in the detailed descriptions that follow: …
Any invocation of a library function that is implemented as a macro shall expand to code that evaluates each of its arguments exactly once, fully protected by parentheses where necessary, so it is generally safe to use arbitrary expressions as arguments.186) Likewise, those function-like macros described in the following subclauses may be invoked in an expression anywhere a function with a compatible return type could be called.187)
186) Such macros might not contain the sequence points that the corresponding function calls do.
187) Because external identifiers and some macro names beginning with an underscore are reserved, implementations may provide special semantics for such names. For example, the identifier _BUILTIN_abs could be used to indicate generation of in-line code for the abs function. Thus, the appropriate header could specify
#define abs(x) _BUILTIN_abs(x)
for a compiler whose code generator will accept it. In this manner, a user desiring to guarantee that a given library function such as abs will be a genuine function may write
#undef abs
whether the implementation’s header provides a macro implementation of abs or a built-in implementation. The prototype for the function, which precedes and is hidden by any macro definition, is thereby revealed also.
The preamble in §7.2 Diagnostics <assert.h> says:
The assert macro shall be implemented as a macro, not as an actual function. If the macro definition is suppressed in order to access an actual function, the behavior is undefined.
And section §7.2.1.1 The assert macro says:
The assert macro puts diagnostic tests into programs; it expands to a void expression. When it is executed, if expression (which shall have a scalar type) is false (that is, compares equal to 0), the assert macro writes information about the particular call that failed (including the text of the argument, the name of the source file, the source line number, and the name of the enclosing function — the latter are respectively the values of the preprocessing macros __FILE__ and __LINE__ and of the identifier __func__) on the standard error stream in an implementation-defined format.191) It then calls the abort function.
191) The message written might be of the form:
Assertion failed: expression, function abc, file xyz, line nnn.
A possible interpretation of the standard
So much for the verbiage of the standard — how does that translate in practice?
A lot hinges on the interpretation of the statement:
Any invocation of a library function that is implemented as a macro shall expand to code that evaluates each of its arguments exactly once
If assert is regarded as a function that is implemented via a macro, then its argument shall be evaluated just once (the conversion to string is a compile-time operation that does not evaluate the expression).
If assert is regarded as 'not a function' (because it is explicitly a macro), then the restriction quoted doesn't necessarily apply to it.
In practice, I'm sure that the intent is that the expression argument to assert should be evaluated only once (and then only if NDEBUG was not defined when the <assert.h> header was last included), so I'd regard it as being constrained as if it were a function implemented via a macro. I'd also regard any implementation of assert that evaluated the expression twice as defective. I'm not certain that the quoted material supports that, but it is all the relevant material I know of in the standard.
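To see what is at stake, here is a toy sketch (the tokenizer is made up) of the side-effect hazard the single-evaluation reading rules out:

#include <cassert>
#include <cstdio>

static int calls = 0;
static const char *next_token() { ++calls; return "tok"; }  // stand-in with a side effect

int main() {
    // Under the single-evaluation reading, `calls` is 1 afterwards (or 0 with
    // NDEBUG defined); a double-evaluating assert would leave it at 2.
    assert(next_token() != nullptr);
    std::printf("next_token() evaluated %d time(s)\n", calls);
}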
I know that this code is valid both in C and C++:
#define FOO 0
#define FOO 0
ISO/IEC 14882:2011
16.3 Macro replacement [cpp.replace]
2 An identifier currently defined as an object-like macro may be
redefined by another #define preprocessing directive provided that the
second definition is an object-like macro definition and the two
replacement lists are identical, otherwise the program is ill-formed.
Likewise, an identifier currently defined as a function-like macro may
be redefined by another #define preprocessing directive provided that
the second definition is a function-like macro definition that has the
same number and spelling of parameters, and the two replacement lists
are identical, otherwise the program is ill-formed.
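For instance, these redefinitions are fine while slight variations are not (names made up):

#define N 0
#define N 0              // OK: object-like, identical replacement list
// #define N (0)         // would be ill-formed: "(0)" is not identical to "0"

#define F(x) ((x) + 1)
#define F(x) ((x) + 1)   // OK: same parameter spelling, identical replacement list
// #define F(y) ((y) + 1)   // would be ill-formed: parameter spelled differently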
But what about this code?
#define FOO 0
#define FOO FOO
Replacement lists are not identical at the start of preprocessing (only when the first replacement occurs).
This is not allowed in either C or C++. The replacement list must be identical. What you're talking about (after the first pass) is the result of processing the replacement list [1], not the replacement list itself. Since the replacement list itself is not identical, the code is not allowed.
[1] Or at least what the result would be if the preprocessor worked a particular way that happens to be different from how it actually does.