Dereferencing strings in preprocessor expressions - c++

My reading of the draft standard documents suggests that it should be legal to dereference a string literal, either with a unary * or with a constant subscript, in a preprocessor expression. For instance, I should be able to say (using the predefined __DATE__ macro, which expands to a quoted string):
#if *__DATE__ == 'A'
or
#if __DATE__[0] == 'A'
If I do this in GCC, with -std=gnu++0x, the former complains
error: operator '*' has no left operand
and the latter complains
error: token ""Feb 16 2016"" is not valid in preprocessor expressions
The standards don't seem to define constant-expression any differently between the compiler and the preprocessor. The compiler happily compiles stuff like:
int foo[*__DATE__];
or
int foo[__DATE__[0]];
at global scope, proving that these are legitimate constant expressions.
I call foul. It seems to me that the standard requires the preprocessor to handle these types of expressions in #if or #elif clauses. Does anyone have any counterargument, before I go and report this as a GCC bug?

Your technique works in ordinary code, such as an if (*__DATE__ == 'A') statement, but not in an #if directive. The preprocessor won't do that sort of expression evaluation.
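One workaround is to leave the preprocessor out of it and let the compiler proper evaluate the expression. A minimal sketch, assuming C++11 or later; kBuiltInAMonth and build_month_kind are hypothetical names used only for illustration:

```cpp
// The preprocessor cannot index __DATE__, but the compiler proper can:
// __DATE__ expands to a string literal such as "Feb 16 2016", and reading
// one of its characters is a constant expression from C++11 onwards.
constexpr bool kBuiltInAMonth = (__DATE__[0] == 'A');  // "Apr" or "Aug"

// Hypothetical helper: branches on the compile-time constant with an
// ordinary `if` instead of #if.
inline const char* build_month_kind() {
    if (kBuiltInAMonth) {
        return "built in April or August";
    }
    return "built in some other month";
}

// Both spellings from the question read the same first character.
static_assert(kBuiltInAMonth == (*__DATE__ == 'A'),
              "unary * and [0] must agree");
```

Unlike #if, this cannot remove declarations from the translation unit; it only selects between code paths that must both compile.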

Related

Do 'true' and 'false' have their usual meaning in preprocessor conditionals?

Given a C++11 compiler, which #error should it end up with?
// no #includes!
#define SOMEMACRO true
#if SOMEMACRO
#error "it was true"
#else
#error "it was false"
#endif
Obviously I'm using #error just as a test. I know true and false are defined in the language proper, but this is preprocessor context. In C99 it seems not to be recognised by the preprocessor.
I'm asking because every compiler I tried treats it as true, while a static code analysis tool insists that true isn't defined, treats it as implicitly false, and ends up at "it was false".
In all ISO C++ standards, both true and false are keywords (the Boolean literals), just like nullptr since C++11. So #if SOMEMACRO is equivalent to #if true, and the preprocessor takes the first branch.
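The same check can be written without #error so that it compiles and can be inspected. A small sketch for a C++11 compiler; kSelectedBranch and kFalseBranchTaken are hypothetical names:

```cpp
// No #includes: in C++ the preprocessor itself understands `true` and `false`.
#define SOMEMACRO true

#if SOMEMACRO
constexpr int kSelectedBranch = 1;  // corresponds to "it was true"
#else
constexpr int kSelectedBranch = 0;  // corresponds to "it was false"
#endif

// `false` evaluates to 0 and `true` to 1, so this condition is 0.
#if false || !true
constexpr bool kFalseBranchTaken = true;
#else
constexpr bool kFalseBranchTaken = false;
#endif

static_assert(kSelectedBranch == 1, "a conforming C++ preprocessor takes the first branch");
static_assert(!kFalseBranchTaken, "false and !true both evaluate to 0 in #if");
```

A tool that preprocesses this with C rules and no <stdbool.h> would pick the other branch for SOMEMACRO, which matches the analyzer behaviour described in the question.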
In C before C23, however, neither true nor false is a keyword. Since C99 they are macros defined as 1 and 0 respectively by <stdbool.h>. This means that if you don't include stdbool.h, the compiler proper should complain about unrecognized identifiers for true, false, etc. After including the header, #if SOMEMACRO becomes #if 1, which is truthy in C.
For preprocessing, this quote from CppReference is meaningful:
Any identifier, which is not literal, non defined using #define directive, evaluates to 0.
So in your (probably C-oriented) static analysis tool, it sees true as a non-#define-defined identifier, and therefore evaluates true to zero. You're not going to observe this behavior if you use a C++ analysis tool.
In that case, you probably shouldn't have missed the #include <stdbool.h> in the first place, though.
According to [cpp.cond]/4 in the C++11 standard:
Prior to evaluation, macro invocations in the list of preprocessing tokens that will become the controlling constant expression are replaced (except for those macro names modified by the defined unary operator), just as in normal text. […] After all replacements due to macro expansion and the defined unary operator have been performed, all remaining identifiers and keywords, except for true and false, are replaced with the pp-number 0, and then each preprocessing token is converted into a token. The resulting tokens comprise the controlling constant expression which is evaluated according to the rules of [expr.const] using arithmetic that has at least the ranges specified in [support.limits]. […] Each subexpression with type bool is subjected to integral promotion before processing continues.
Emphasis mine; from the bolded passages it follows that bool-typed expressions are meant to be supported in preprocessor conditions just like in the language proper, including bool literals true and false. The [expr.const] section defining constant expressions is referred to from other sections that use it in non-preprocessing context, from which it follows that the evaluation rules are the same in the preprocessor and the language proper.
I’d assume similar language appears in all later revisions of the C++ standard, and probably in earlier ones too. In C before C23, on the other hand, true and false are not keywords but macros defined in stdbool.h, so the preprocessor treats them just like any other identifier.
The usual practice is to use 1 and 0 for logical values in preprocessor expressions for maximum portability, and preferably to avoid directly referring to them entirely.
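A sketch of that convention, using a hypothetical configuration macro MYLIB_USE_FAST_PATH and a hypothetical function selected_path. It never mentions true or false, so C and C++ preprocessors evaluate it the same way:

```cpp
// Hypothetical configuration macro: 1 enables the feature, 0 disables it.
// A build could override the default, e.g. with -DMYLIB_USE_FAST_PATH=0.
#ifndef MYLIB_USE_FAST_PATH
#define MYLIB_USE_FAST_PATH 1
#endif

// Reports which branch the preprocessor selected.
static const char* selected_path(void) {
#if MYLIB_USE_FAST_PATH
    return "fast";
#else
    return "portable";
#endif
}
```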
As other answers already pointed out correctly, true and false should work there with C++ compilers.
OP here: it was indeed a configuration problem of the SCA tool. In Helix, the option -preproccppkeywords, described as "When enabled, the C++ alternative tokens are treated as keywords.", was responsible for this. With it switched on, the tool behaves as expected: true and false are recognized during preprocessing.

How to make GCC reduce 'arg,##__VA_ARGS__' to 'arg' to use it as single macro argument?

I have these macros defined for Visual Studio and Clang, and they both compile fine:
#if defined(_MSC_VER)
# define _declare_func(...) PP_CAT(PP_CAT(_declare_func_, PP_NARG(__VA_ARGS__)),(__VA_ARGS__))
# define declare_func(...) _declare_func PP_LEFT_PAREN notused,##__VA_ARGS__ PP_RIGHT_PAREN
#else // clang version
# define _declare_func(...) PP_CAT(_declare_func_, PP_NARG(__VA_ARGS__))(__VA_ARGS__)
# define declare_func(...) _declare_func ( notused,##__VA_ARGS__ )
#endif
#define _declare_func_1(notused) void my_function()
#define _declare_func_2(notused, scope) void scope::my_function()
class MyClass
{
declare_func();
};
declare_func(MyClass) { }
PP_CAT is a classic multilevel concatenation macro.
PP_NARG counts the number of macro arguments.
PP_LEFT_PAREN and PP_RIGHT_PAREN reduce to '(' and ')'.
Is there any way to achieve this with GCC? (I tried both macro versions with GCC 5.2; both fail to compile because the comma seems to be propagated during macro expansion and removed only at the end of preprocessing, making PP_NARG always reduce to '2' and never '1'.)
Thanks!
From the GCC documentation:
Second, the ‘##’ token paste operator has a special meaning when placed between a comma and a variable argument. If you write
#define eprintf(format, ...) fprintf (stderr, format, ##__VA_ARGS__)
and the variable argument is left out when the eprintf macro is used, then the comma before the ‘##’ will be deleted. This does not happen if you pass an empty argument, nor does it happen if the token preceding ‘##’ is anything other than a comma.
eprintf ("success!\n")
==> fprintf(stderr, "success!\n");
The above explanation is ambiguous about the case where the only macro parameter is a variable arguments parameter, as it is meaningless to try to distinguish whether no argument at all is an empty argument or a missing argument. In this case the C99 standard is clear that the comma must remain, however the existing GCC extension used to swallow the comma. So CPP retains the comma when conforming to a specific C standard, and drops it otherwise.
So for
#define declare_func(...) _declare_func ( notused,##__VA_ARGS__ )
the comma remains when GCC conforms to a specific standard; you can use -std=gnu99 or -std=gnu++11 to drop the comma and get a working macro.
To make your macro work with -std=c++11, you have to ensure that it is always passed at least one argument.
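One way to force at least one argument is to make the first argument a mandatory tag, so that no comma ever has to be swallowed. This is only a sketch under several assumptions: DECLARE_FUNC, DECLARE_FUNC_1, DECLARE_FUNC_2 and the member tag are hypothetical names, PP_CAT and PP_NARG are simplified stand-ins for the poster's macros (this PP_NARG handles only one or two arguments), and the return type is changed to int so the result can be observed.

```cpp
// Two-level concatenation so that arguments are expanded before pasting.
#define PP_CAT_IMPL(a, b) a##b
#define PP_CAT(a, b) PP_CAT_IMPL(a, b)

// Simplified argument counter: handles exactly 1 or 2 arguments.
// The trailing `unused` keeps the `...` of PP_NARG_IMPL non-empty,
// which strict pre-C++20 modes require.
#define PP_NARG_IMPL(_1, _2, N, ...) N
#define PP_NARG(...) PP_NARG_IMPL(__VA_ARGS__, 2, 1, unused)

// Dispatch on the argument count. The first argument is a mandatory tag.
#define DECLARE_FUNC(...) PP_CAT(DECLARE_FUNC_, PP_NARG(__VA_ARGS__))(__VA_ARGS__)
#define DECLARE_FUNC_1(tag) int my_function()
#define DECLARE_FUNC_2(tag, scope) int scope::my_function()

struct MyClass {
    DECLARE_FUNC(member);                     // int my_function();
};
DECLARE_FUNC(member, MyClass) { return 42; }  // int MyClass::my_function() { return 42; }

static_assert(PP_NARG(a) == 1, "one argument is counted as 1");
static_assert(PP_NARG(a, b) == 2, "two arguments are counted as 2");
```

The cost is that every call site has to spell out the tag, which is exactly what the original `notused,##__VA_ARGS__` trick was trying to hide.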

Is there logical short-circuiting in the C preprocessor?

The gcc docs for cpp explain about the #if directive:
[...] and logical operations (&& and ||). The latter two obey the usual short-circuiting rules of standard C.
What does that mean? There is no evaluation of expressions during preprocessing, so how can it be short-circuited?
Very simple: undefined macros have numeric value zero, and division by zero is illegal.
#if FIXEDSIZE && CHUNKSIZE/FIXEDSIZE > 42
#define USE_CELLPOOL
#endif
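A self-contained variant of the same idea, as a sketch: FIXEDSIZE is deliberately left undefined so that it evaluates to 0, CHUNKSIZE is given an arbitrary illustrative value, and kUseCellpool is a hypothetical name.

```cpp
#define CHUNKSIZE 4096  // arbitrary illustrative value
// FIXEDSIZE is intentionally not defined, so it is replaced by 0 inside #if.

// Because the left operand of && is 0, the right operand is not evaluated
// and the division by zero in CHUNKSIZE/FIXEDSIZE is not diagnosed.
#if FIXEDSIZE && CHUNKSIZE/FIXEDSIZE > 42
#define USE_CELLPOOL
#endif

#ifdef USE_CELLPOOL
constexpr bool kUseCellpool = true;
#else
constexpr bool kUseCellpool = false;
#endif
```

Writing the condition as `#if CHUNKSIZE/FIXEDSIZE > 42` alone would make the preprocessor report a division by zero.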
#if does evaluate the rest of its line as an integer constant expression. Your linked documentation begins:
The ‘#if’ directive allows you to test the value of an arithmetic expression, rather than the mere existence of one macro.
That isn't a GCC extension; the Standard's syntax for #if is
# if constant-expression new-line group_opt
The C99 preprocessor treats all constants as [u]intmax_t.
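One visible consequence, sketched below with hypothetical constant names: inside #if, arithmetic is carried out in the widest integer types, so the result can differ from the same expression evaluated in int by the compiler proper. The sketch assumes C99 or C++11 rules, a 32-bit int, and an intmax_t of at least 64 bits.

```cpp
// 2147483647 + 1 would overflow a 32-bit int in ordinary code, but in #if
// both operands have type intmax_t, so the sum is simply 2147483648.
#if 2147483647 + 1 > 0
constexpr bool kNoOverflowInIf = true;
#else
constexpr bool kNoOverflowInIf = false;
#endif

// Mixing signed and unsigned: -1 is converted to uintmax_t and becomes the
// largest unsigned value, so the comparison is false.
#if -1 < 0u
constexpr bool kMinusOneBelowZeroU = true;
#else
constexpr bool kMinusOneBelowZeroU = false;
#endif
```

Some compilers warn about the sign change in the second condition, which is a useful hint that the expression does not mean what it appears to.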
What they are referring to is && and || operators for #if
#if defined (AAA) || defined (BBB)
If defined (AAA) is true, then defined (BBB) is never evaluated.
UPDATE
So evaluation of the expression is short-circuited. For example, build the following with -Wundef, which warns about the use of undefined macros:
#if defined FOO && FOO > 1000
#endif
#if FOO > 1000
#endif
will result in
thomas:~ jeffery$ gcc foo.c -Wundef
foo.c:4:5: warning: 'FOO' is not defined, evaluates to 0 [-Wundef]
#if FOO > 1000
^
1 warning generated.
So the first version does not generate the undefined macro warning, because FOO > 1000 is not evaluated.
OLD MUSINGS
This becomes important if the second part is a macro that has side effects. The macro would not be evaluated, so the side effects would not take place.
To avoid macro abuse I'll give a somewhat sane example
#define FOO
#define IF_WARN(x) _Pragma (#x) 1
#if defined(FOO) || IF_WARN(GCC warning "FOO not defined")
#endif
Now that I constructed this example, I now run into a problem. IF_WARN is always evaluated.
huh, more research needed.
Well foo… now that I read it again.
Macros. All macros in the expression are expanded before actual computation of the expression's value begins.
There is no evaluation of expressions during preprocessing, so how can it be short-circuited?
Yes there is evaluation of expression during preprocessing.
C11: 6.10.1 Conditional inclusion (p4):
Prior to evaluation, macro invocations in the list of preprocessing tokens that will become ...
In a footnote 166:
Because the controlling constant expression is evaluated during translation phase 4, all identifiers....
These statements clearly testify that there is evaluation of expression in preprocessing. The necessary condition is that the controlling expression must evaluate to an integer value.
Now the operator && and || will obey the usual short-circuiting rules of standard C as stated in GNU doc.
Now compile and run this program with and without the // in front of the second #define, and compare the output:
#include <stdio.h>
#define macro1 1
//#define macro2 1
int main( void )
{
    /* "Hello!" is printed only when both macros are defined. */
#if defined (macro1) && defined (macro2)
    printf( "Hello!\n" );
#endif
    printf( "World\n" );
    return 0;
}
Evaluating macro conditions is a part (a major part) of preprocessing, so evaluation does occur and short-circuiting is meaningful there. See the examples in the other answers.
A conditional is a directive that instructs the preprocessor to select
whether or not to include a chunk of code in the final token stream
passed to the compiler. Preprocessor conditionals can test arithmetic
expressions, or whether a name is defined as a macro, or both
simultaneously using the special defined operator.
Moreover, it can reduce compile time: skipping evaluations that cannot affect the result may speed up compilation (depending on the compiler's implementation).

What is the value of an undefined constant used in #if?

My preprocessor appears to assume that undefined constants are 0 for the purpose of evaluating #if conditions.
Can this be relied upon, or do undefined constants give undefined behaviour?
Yes, it can be relied upon. The C99 standard specifies at §6.10.1 ¶3:
After all replacements due to macro expansion and the defined unary
operator have been performed, all remaining identifiers are replaced with the pp-number
0
Edit
Sorry, I thought it was a C question; still, no big deal, the equivalent section in the C++ standard (§16.1 ¶4) states:
After all replacements due to macro expansion and the defined unary operator
have been performed, all remaining identifiers and keywords, except for true and false, are replaced with the pp-number 0
The only difference is the different handling of true and false, which in C do not need special handling, while in C++ they have a special meaning even in the preprocessing phase.
An identifier that is not defined as a macro is converted to 0 before the expression is evaluated.
The exception is the identifier true, which is converted to 1. This is specific to the C++ preprocessor; in C, this doesn't happen and you would need to include <stdbool.h> to use true this way, in which case it will be defined as a macro and no special handling is required.
The OP was asking specifically about the C preprocessor and the first answer was correctly referring to the C preprocessor specification. But some of the other comments seem to blur the distinction between the C preprocessor and the C compiler. Just to be clear, those are two different things with separate rules and they are applied in two separate passes.
#if 0 == NAME_UNDEFINED
int foo = NAME_UNDEFINED;
#endif
This example passes the preprocessor condition and emits the foo definition, because the C preprocessor evaluates NAME_UNDEFINED to 0 inside the conditional expression. The compiler then reports an error, because the initializer is not part of a preprocessor conditional, so the C compiler sees NAME_UNDEFINED as an undeclared identifier.
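A compilable version of that distinction, as a sketch; NAME_UNDEFINED is deliberately never defined, and kFoo and kNameIsDefined are hypothetical names:

```cpp
#if 0 == NAME_UNDEFINED
// Reached: inside #if the unknown identifier has been replaced by 0.
// Writing `= NAME_UNDEFINED` here instead would fail to compile, because
// the compiler proper performs no such replacement.
constexpr int kFoo = 0;
#else
constexpr int kFoo = -1;
#endif

// defined() asks the actual question without relying on the 0 substitution.
#if defined(NAME_UNDEFINED)
constexpr bool kNameIsDefined = true;
#else
constexpr bool kNameIsDefined = false;
#endif
```

Using defined() also keeps the code quiet under warnings such as GCC's -Wundef, which flags undefined identifiers that are silently evaluated as 0.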