Is the standard C assert(e) macro permitted to evaluate e multiple times? What about C++11 or later? I don't see any guarantees in the Open Group spec, and the answer isn't apparent to me from some searching.
Context: could func() be called multiple times in assert(func() != NULL)?
Yes, I already know this is a bad idea for other reasons: as the glibc manual points out, the argument of assert() won't be evaluated at all if NDEBUG is defined. However, assuming NDEBUG is not defined, is there any guarantee on the maximum number of times e is evaluated?
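To make that pitfall concrete, here is a minimal sketch (func and calls are illustrative names): compiled normally it prints 1, assuming the common single-evaluation implementation; compiled with -DNDEBUG the assert expands to nothing, func() is never called, and it prints 0.

#include <assert.h>
#include <stdio.h>

static int calls = 0;              /* observable side effect */

static int *func(void) {
    static int value = 42;
    ++calls;
    return &value;
}

int main(void) {
    assert(func() != NULL);        /* vanishes entirely under -DNDEBUG */
    printf("func() called %d time(s)\n", calls);
    return 0;
}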
Question prompted by this one.
The C standard says
In the C11 standard (ISO/IEC 9899:2011), §7.1.4 Use of library functions says:
Each of the following statements applies unless explicitly stated otherwise in the detailed descriptions that follow: …
Any invocation of a library function that is implemented as a macro shall expand to code that evaluates each of its arguments exactly once, fully protected by parentheses where necessary, so it is generally safe to use arbitrary expressions as arguments.186) Likewise, those function-like macros described in the following subclauses may be invoked in an expression anywhere a function with a compatible return type could be called.187)
186) Such macros might not contain the sequence points that the corresponding function calls do.
187) Because external identifiers and some macro names beginning with an underscore are reserved, implementations may provide special semantics for such names. For example, the identifier _BUILTIN_abs could be used to indicate generation of in-line code for the abs function. Thus, the appropriate header could specify
#define abs(x) _BUILTIN_abs(x)
for a compiler whose code generator will accept it. In this manner, a user desiring to guarantee that a given library function such as abs will be a genuine function may write
#undef abs
whether the implementation’s header provides a macro implementation of abs or a built-in implementation. The prototype for the function, which precedes and is hidden by any macro definition, is thereby revealed also.
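As an aside, here is a short sketch of the technique footnote 187 describes (illustrative, not taken from the standard): after the #undef, the name refers to the genuine function, so it can even be taken by address.

#include <stdlib.h>
#undef abs                 /* strip any macro version; the prototype
                              declared in <stdlib.h> remains visible */

int main(void) {
    int (*pf)(int) = abs;  /* abs is now guaranteed to name the real function */
    return pf(-3) - 3;     /* 0 */
}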
The preamble in §7.2 Diagnostics <assert.h> says:
The assert macro shall be implemented as a macro, not as an actual function. If the macro definition is suppressed in order to access an actual function, the behavior is undefined.
And section §7.2.1.1 The assert macro says:
The assert macro puts diagnostic tests into programs; it expands to a void expression. When it is executed, if expression (which shall have a scalar type) is false (that is, compares equal to 0), the assert macro writes information about the particular call that failed (including the text of the argument, the name of the source file, the source line
number, and the name of the enclosing function — the latter are respectively the values of the preprocessing macros __FILE__ and __LINE__ and of the identifier __func__) on the standard error stream in an implementation-defined format.191) It then calls the abort function.
191) The message written might be of the form:
Assertion failed: expression, function abc, file xyz, line nnn.
A possible interpretation of the standard
So much for the verbiage of the standard — how does that translate in practice?
A lot hinges on the interpretation of the statement:
Any invocation of a library function that is implemented as a macro shall expand to code that evaluates each of its arguments exactly once
If assert is regarded as a function that is implemented via a macro, then its argument shall be evaluated just once (the conversion to string is a compile-time operation that does not evaluate the expression).
If assert is regarded as 'not a function' (because it is explicitly a macro), then the restriction quoted doesn't necessarily apply to it.
In practice, I'm sure the intent is that the expression argument to assert should be evaluated only once (and then only if NDEBUG was not defined when the <assert.h> header was last included), so I'd regard it as being constrained as if it were a function that is implemented via a macro. I'd also regard any implementation that implemented assert in such a way that the expression was evaluated twice as defective. I'm not certain the quoted material supports that, but it is all the relevant material I know of in the standard.
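For reference, a typical definition looks roughly like this (a sketch modeled on glibc's approach; the helper name __assert_fail and its exact signature are implementation details that vary). Note that e appears exactly once in the active expansion, and #e merely stringizes the expression without evaluating it:

#ifdef NDEBUG
#define assert(e) ((void)0)
#else
#define assert(e) \
    ((e) ? (void)0 \
         : __assert_fail(#e, __FILE__, __LINE__, __func__))
#endif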
Related
An answer to the question Disable check for override in gcc suggested using -Doverride= on the command line to disable errors for erroneous use of override, which is effectively the same as adding:
#define override
to the source file.
My initial reaction was that this seems like undefined behavior, since we are redefining a keyword. But looking at the draft C++11 standard section 2.12 Keywords [lex.key], I was surprised to find that neither override nor final is a keyword. They are covered in the previous section 2.11 [lex.name], which says they are identifiers with special meaning:
The identifiers in Table 3 have a special meaning when appearing in a certain context [...]
and Table 3 is labelled Identifiers with special meaning and includes both override and final.
The question is: is it undefined behavior to redefine (using #define) identifiers with special meaning? Are they treated any differently than keywords in this respect?
If you are using the C++ standard library, it is undefined behavior to redefine identifiers with special meaning; this also applies to keywords. From the draft C++11 standard, under section 17.6.4 [constraints] we have section 17.6.4.1 [constraints.overview], which says:
This section describes restrictions on C++ programs that use the
facilities of the C++ standard library [...]
and under 17.6.4 we have section 17.6.4.3.1 [macro.names] which says:
A translation unit shall not #define or #undef names lexically
identical to keywords, to the identifiers listed in Table 3, or to the
attribute-tokens described in 7.6.
Table 3 lists the Identifiers with special meaning. We can see this paragraph also covers keywords, and they are treated in the same manner.
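A minimal sketch of what the rule forbids (formally ill-formed, though typical compilers accept it without a diagnostic):

#include <iostream>   // the translation unit now uses the standard library ...

#define override      // ... so this redefinition is ill-formed per [macro.names]

struct Base { virtual void f() {} };
struct Derived : Base {
    void f() override {}   // 'override' expands to nothing here
};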
Implementations' standard header files are allowed to "implement" standard functions using macros in cases where a macro could meet the requirements for the function (including ensuring that arguments are evaluated exactly once). Further, such macros are allowed to make use of keywords or identifiers whose behavior is specified in the standard or "reserved to the implementation"; use of such macros in contexts where those keywords or identifiers have been redefined could have arbitrary effects.
That having been said, the historical interpretation of this form of UB would be to say that compilers shouldn't go out of their way to cause wacky behavior, and outside of "pedantic modes" should allow user code to assign meanings to reserved identifiers the compiler would otherwise not use. This can be helpful in cases where code should be usable both on compilers that require a keyword like __packed and on compilers that neither recognize nor require such a keyword (see the sketch below). Redefining keywords in the fashion you're doing is a bit dodgier; it will probably work, but there's a significant likelihood that it will disrupt the behavior of a standard-library macro.
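A sketch of that portability pattern; HAVE_PACKED_KEYWORD is a hypothetical feature-test macro standing in for whatever per-compiler detection you actually use:

#if !defined(HAVE_PACKED_KEYWORD)   /* hypothetical feature test */
#define __packed    /* this compiler neither needs nor recognizes it */
#endif

__packed struct wire_header {
    unsigned char tag;
    unsigned int  length;
};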
Once again I found one of those arcane function definitions inside the Linux kernel. The signature of the function reads:
static void __sched __schedule(void)
Now it has both void and __sched as the return type. Can someone please explain what those identifiers are doing there? Shouldn't it be either void or __sched? How can it be both?
This is the definition of __sched:
#define __sched __attribute__((__section__(".sched.text")))
void is a standard C type, indicating that the function doesn't return a result.
__sched is a macro that expands in accordance with the definition you quoted, making the declaration equivalent to:
static void __attribute__((__section__(".sched.text"))) __schedule(void)
__attribute__ is a language extension supported by gcc (and by compilers compatible with gcc). Its meaning is documented in the gcc manual. It specifies that the generated code for the function should be placed in a specified section in the object file.
Since __sched, or rather the sequence that it expands to, is not a type name, there's no conflict between it and void.
(The double parentheses in the syntax of __attribute__ allow a macro definition like
#define __attribute__(arg)
to be used if you want to compile the code with a compiler that doesn't support that extension, causing it to be ignored rather than treated as a syntax error. Some attributes take multiple arguments; wrapping the entire argument list in an extra set of parentheses allows the entire list to be treated, as far as the preprocessor is concerned, as a single argument.)
In C++11 when a preprocessing directive of the form...
#if expr
...is encountered, expr is evaluated as a constant-expression as described in 16.1 [cpp.cond].
This is done after macro replacement on expr: its identifiers (and keywords) are replaced by 0, its preprocessing-tokens are converted to tokens, the defined operator is evaluated, and so on.
My question is what happens when one of the tokens in expr is a user-defined-literal?
User-defined literals are like function calls, but function calls can't occur in expr (I think), as a side effect of the identifier replacement. However, technically user-defined-literals could survive.
I suspect it is an error, but I can't quite see how to conclude that from the standard.
Perhaps the (pedantic) impact of adding user defined literals on clause 16 [cpp] was simply ignored?
Or am I missing something?
Update:
To clarify by an example:
What does this preprocess to:
#if 123_foo + 5.5 > 100
bar
#else
baz
#endif
bar or baz, or is it an error?
GCC 4.7 reports:
test.cpp:1:5: error: user-defined literal in preprocessor expression
so it thinks it is an error. Can this be justified with reference to the standard? Or is this just "implicit"?
In C++11 when a preprocessing directive of the form... #if expr ...is encountered,
expr is evaluated as a constant-expression as described in 16.1 [cpp.cond].
This is done after macro replacement on expr: its identifiers (and
keywords) are replaced by 0, its preprocessing-tokens are converted
to tokens, the defined operator is evaluated, and so on.
My question is what happens when one of the tokens in expr is a
user-defined-literal?
The program is ill-formed.
The core of my argument is gleaned from the observation in 16.1/1 footnote 147, that in translation phase 4 there are no identifiers other than macro names yet.
Argument:
According to 2.14.8 [lex.ext]/2
A user-defined-literal is treated as a call to a literal operator
or literal operator template (13.5.8).
So here we have a remaining call to an (operator) function even after all the substitutions described in 16.1/4. (Other attempts, for example to use a constexpr function, would be thwarted by the substitution of all non-macro identifiers by 0.)
As this occurs in translation phase 4, there are no defined or even declared functions yet; an attempted lookup of the literal-operator-id must fail (see footnote 147 in 16.1/1 for a similar argument).
From a slightly different angle, looking at 5.19/2 we find:
A conditional-expression is a core constant expression unless it
involves one of the following as a potentially evaluated subexpression
(3.2) [...]:
[...]
an invocation of a function other than a constexpr constructor for a literal class or a constexpr function;
an invocation of an undefined constexpr function or an undefined constexpr constructor [...];
From this, use of a user-defined literal in a constant expression requires a defined and constexpr literal operator, which again can't be available in translation phase 4.
gcc is right to reject this.
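A sketch of the contrast (the first directive is well-formed; the second is the error case from the question, shown commented out so the snippet preprocesses cleanly):

// An ordinary identifier surviving macro replacement becomes 0,
// so this is well-formed and selects the #else group:
#if not_a_macro + 5 > 100
#error "not reached"
#else
// selected
#endif

// A user-defined literal is a single pp-number token, not an
// identifier, so it is not replaced by 0 and nothing valid remains:
// #if 123_foo + 5.5 > 100   // error: user-defined literal in
// #endif                    // preprocessor expression (g++)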
In C++11 when a preprocessing directive of the form #ifdef expr is encountered, expr is evaluated as a constant-expression as described in 16.1. This is done after macro replacement on expr: its identifiers (and keywords) are replaced by 0, its preprocessing-tokens are converted to tokens, the defined operator is evaluated, and so on.
No!
The argument to #ifdef, #ifndef, or defined is not evaluated. For example, suppose I never #define the preprocessor symbol SYMBOL_THAT_IS_NEVER_DEFINED. This is perfectly valid:
#ifdef SYMBOL_THAT_IS_NEVER_DEFINED
code
#endif
Using a symbol that isn't defined inside #if is a different matter. Strictly speaking, an identifier that survives macro replacement is replaced by 0, so the following is well-formed and simply evaluates to false; but relying on that is error-prone, and many compilers will flag it (e.g. gcc's -Wundef). Assuming SYMBOL_THAT_IS_NEVER_DEFINED hasn't been defined:
#if SYMBOL_THAT_IS_NEVER_DEFINED
code
#endif
Analogous to checking whether a pointer is non-null before dereferencing it, explicitly checking whether a symbol is defined before using it is the safer approach:
#if (defined SYMBOL_THAT_MIGHT_BE_DEFINED) && SYMBOL_THAT_MIGHT_BE_DEFINED
code
#endif
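In other words (a small sketch): #ifdef X behaves like #if defined(X); both test only whether the macro exists and never evaluate its replacement list.

#define EMPTY              /* defined, expands to nothing */

#ifdef EMPTY               /* taken: EMPTY is defined     */
#endif

#if defined(EMPTY)         /* equivalent spelling         */
#endif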
I know that this code is valid both in C and C++:
#define FOO 0
#define FOO 0
ISO/IEC 14882:2011
16.3 Macro replacement [cpp.replace]
2 An identifier currently defined as an object-like macro may be
redefined by another #define preprocessing directive provided that the
second definition is an object-like macro definition and the two
replacement lists are identical, otherwise the program is ill-formed.
Likewise, an identifier currently defined as a function-like macro may
be redefined by another #define preprocessing directive provided that
the second definition is a function-like macro definition that has the
same number and spelling of parameters, and the two replacement lists
are identical, otherwise the program is ill-formed.
But what about this code?
#define FOO 0
#define FOO FOO
The replacement lists (0 and FOO) are not identical as written; they would only coincide after the first replacement occurs.
This is not allowed in either C or C++. The replacement list must be identical. What you're talking about (after the first pass) is the result of processing the replacement list1), not the replacement list itself. Since the replacement list itself is not identical, the code is not allowed.
1) Or at least what the result would be if the preprocessor worked a particular way that happens to be different from how it actually does.
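To spell out which redefinitions the rule permits (a sketch; the ill-formed ones are commented out so the snippet preprocesses cleanly):

#define FOO 0
#define FOO 0          /* OK: same replacement list, token for token  */
#define FOO    0       /* OK: amount of whitespace is not significant */

/* Each of these would make the program ill-formed:                   */
/* #define FOO (0)      -- different tokens: ( 0 ) versus 0           */
/* #define FOO FOO      -- the list is the token FOO, not the token 0 */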
In this article from Guru of the Week, it is said: "It is illegal to #define a reserved word." Is this true? I can't find anything in the standard, and I have already seen programmers redefine new, for instance.
17.4.3.1.1 Macro names [lib.macro.names]
1 Each name defined as a macro in a header is reserved to the implementation for any use if the translation unit includes the header.164)
2 A translation unit that includes a header shall not contain any macros that define names declared or defined in that header. Nor shall such a translation unit define macros for names lexically identical to keywords.
By the way, new is an operator, and it can be overloaded (replaced) by users providing their own version.
The corresponding section from C++11:
17.6.4.3.1 Macro names [macro.names]
1 A translation unit that includes a standard library header shall not #define or #undef names declared in any standard library header.
2 A translation unit shall not #define or #undef names lexically identical to keywords.
Paragraph 1 from C++03 has been removed. The second paragraph has been split in two. The first half has been changed to state specifically that it only applies to standard library headers. The second half has been broadened to cover any translation unit, not just those that include headers.
However, the Overview for this section of the standard (17.6.4.1 [constraints.overview]) states:
This section describes restrictions on C++ programs that use the facilities of the C++ standard library.
Therefore, if you are not using the C++ standard library, then you're okay to do what you will.
So to answer your question in the context of C++11: you cannot define (or undefine) any names identical to keywords in any translation unit if you are using the C++ standard library.
They're actually wrong there, or at least don't tell the whole story. The real reason it's disallowed is that it violates the one-definition rule (which, by the way, is also mentioned as the second reason why it's illegal).
To see that redefining keywords is actually allowed, at least if you don't use the standard library, you have to look at an entirely different part of the standard, namely the translation phases. The input is decomposed into preprocessing tokens before preprocessing takes place, and at that level there's no distinction between private and fubar: both are simply identifiers to the preprocessor. Later, when the input is decomposed into tokens, the replacement has already taken place.
It has been pointed out that there's a restriction on programs that use the standard library, but it's not evident that the example redefining private does so (as opposed to the "Person #4: The Language Lawyer" snippet, which uses it in output to cout).
The last example mentions that the trick neither gets trampled on by other translation units nor tramples on them. With this in mind, you should probably consider the possibility that the standard library is being used somewhere else, which would put this restriction into effect.
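A sketch of the phases argument: to the preprocessor, private is only an identifier, so the replacement happens before the compiler proper ever sees a keyword (formally permissible only while no standard library header is involved):

#define private public

struct S {
private:            // after phase 4 this reads: public:
    int x;
};

int main() {
    S s;
    s.x = 1;        // compiles, because x ended up public
    return s.x - 1;
}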
Here's a little thing you can do if you don't want someone to use goto. Just drop the following somewhere in their code where they won't notice it.
#define goto { int x = *(int *)0; } goto
Now every time they try to use a goto statement, the program will (in all likelihood) crash: the expansion dereferences a null pointer, which is undefined behavior but a near-certain segfault in practice.
It's not, as far as I'm aware, illegal; no compiler I've come across yet will generate an error if you do
#define true false
#defining certain keywords is likely to generate errors in compilation for other reasons, but a lot of them will just result in very strange program behaviour.
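For example (a sketch): this compiles without complaint on typical compilers, yet silently inverts the condition.

#include <cstdio>
#define true false   // formally forbidden when the standard library is
                     // used, but rarely diagnosed

int main() {
    if (true)                     // expands to: if (false)
        std::puts("never printed");
    else
        std::puts("surprise!");   // this branch runs instead
}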