Macro concatenation and further expansion

Macro concatenation and further expansion - c++

Can I safely expect this
#define TEMPLATE_DECL_BEGIN_0 template <
#define TEMPLATE_DECL_BEGIN_1 TEMPLATE_DECL_BEGIN_0 typename Arg0
#define TEMPLATE_DECL_BEGIN_2 TEMPLATE_DECL_BEGIN_1 , typename Arg1
#define TEMPLATE_DECL_BEGIN_3 TEMPLATE_DECL_BEGIN_2 , typename Arg2
#define TEMPLATE_DECL(N) TEMPLATE_DECL_BEGIN_ ## N >
TEMPLATE_DECL(0)
TEMPLATE_DECL(1)
TEMPLATE_DECL(2)
TEMPLATE_DECL(3)
to generate
template < >
template < typename Arg0 >
template < typename Arg0 , typename Arg1 >
template < typename Arg0 , typename Arg1 , typename Arg2 >
on any reasonably standard C preprocessor?
My worry is about macro expansion after concatenation after previous replacement: does it work so that after N gets replaced for example by 2 then
TEMPLATE_DECL_BEGIN_2
becomes
TEMPLATE_DECL_BEGIN_1 , typename Arg1
?

Yes. From 6.10.3.3§3 of the C99 standard:
For both object-like and function-like macro invocations, before the replacement list is
reexamined for more macro names to replace, each instance of a ## preprocessing token
in the replacement list (not from an argument) is deleted and the preceding preprocessing
token is concatenated with the following preprocessing token.
And 6.10.3.4§3 :
After all parameters in the replacement list have been substituted and #and ##
processing has taken place, all placemarker preprocessing tokens are removed. Then, the
resulting preprocessing token sequence is rescanned, along with all subsequent
preprocessing tokens of the source ﬁle, for more macro names to replace.
The standard guarantees that x ## y happens before replacing more macro names, so if you construct a macro name at that time, it will be replaced.
This is from the C99 standard, but I highly doubt they changed this secion from the C89 standard, which would be the version that really applies to C++.

Related

Macro expansion problems in MSVC

I wonder why this macro is expanding so much.
#define CONCAT_IMPL(A, B) A##B
#define CONCAT(A, B) CONCAT_IMPL(A, B)
#define EAT(...)
#define TEST(ARG) EXPANDED, ARG) EAT(
#define GET_LAST(A, B) B
int result = 0;
result = GET_LAST(CONCAT(TEST, (1)), 2); // result is 2
result = GET_LAST(TEST(1), 2); // result is 2
result = GET_LAST(EXPANDED, 1) EAT(, 2); // result is 1
I want GET_LAST(CONCAT(TEST, (1)), 2); evaluated value 1.
I'd appreciate it if you could tell me if it's possible on MSVC or if something's missing.

C11 draft:
The sequence of preprocessing tokens bounded by the outside-most matching parentheses forms the list of arguments for the function-like macro. The individual arguments within the list are separated by comma preprocessing tokens, but comma preprocessing tokens between matching inner parentheses do not separate arguments.
GET_LAST(CONCAT(TEST, (1)), 2) is an invocation of the macro GET_LAST with a list of two arguments. One is CONCAT(TEST, (1)) and the other one is 2.
After the arguments for the invocation of a function-like macro have been identified, argument substitution takes place. A parameter in the replacement list, unless preceded by a # or ## preprocessing token or followed by a ## preprocessing token (see below), is replaced by the corresponding argument after all macros contained therein have been expanded. Before being substituted, each argument's preprocessing tokens are completely macro replaced as if they formed the rest of the preprocessing file; no other preprocessing tokens are available.
The first parameter A does not occur in the replacement list of the macro, so nothing is done with the corresponding argument. The second parameter B occurs, so the corresponding argument macro-expands to 2, and the occurrence of B in the replacement list is substituted with the expansion.

How does expansion of the macro parameters work in c++

I just noticed an interesting thing about the expansion of the macro parameters in C++.
I defined 4 macros; 2 of them turn given parameter into string and another 2 try to separate 2 arguments. I passed them argument with macro which expands into , and got the following results:
#define Quote(x) #x
#define String(x) Quote(x)
#define SeparateImpl(first, second) first + second
#define Separate(pair) SeparateImpl(pair)
#define comma ,
int main(){
Quote(1 comma 2); // -> "1 comma 2"
String(1 comma 2); // -> "1 , 2"
SeparateImpl(1 comma 2); // -> 1 , 2 + *empty arg*
Separate(1 comma 2); // -> 1 , 2 + *empty arg*
return 0;
}
So, as we see macro String turned into "1 , 2", that means macro comma had been unpacked first. However, macro Separate turned into 1 , 2 + **empty arg**, that means macro comma hadn't been unpacked first and I wonder why? I tried this in VS2019.

#define Quote(x) #x
#define String(x) Quote(x)
#define SeparateImpl(first, second) first + second
#define Separate(pair) SeparateImpl(pair)
#define comma ,
Macro invocation proceeds as follows:
Argument substitution (a.s), where if a parameter is mentioned in the replacement list and said parameter does not participate in a paste or stringification, it is fully expanded and said mentions of the parameter in the replacement list are substituted with the result.
Stringification
Pastes
Rescan and further replacement (r.a.f.r.), where the resulting replacement list is rescanned, during which the macro's name is marked as invalid for expansion ("painted blue").
Here's how each case should expand:
Quote(1 comma 2)
a.s. no action (only mention of parameter is stringification). Stringification applies. Result: "1 comma 2".
String(1 comma 2)
a.s. applies; yielding Quote(1 , 2). During r.a.f.r., Quote identified as a macro, but the argument count doesn't match. This is invalid. But see below.
SeparateImpl(1 comma 2)
Invalid macro call. The macro is being invoked with one argument, but it should have 2. Note that comma being defined as a macro is irrelevant; at the level of macro invocation you're just looking at the tokens.
Separate(1 comma 2)
a.s. applies; yielding SeparateImpl(1 , 2). During r.a.f.r., SeparateImpl is invoked... that invocation's a.s. applies, yielding 1 + 2.
I tried this in VS2019.
I could tell from a glance it was VS something before 2020, where the walls tells me they're finally going to work on preprocessor compliance. VS in particular seems to have this strange state in which tokens with commas in them nevertheless are treated as single arguments (it's as if argument identification occurs before expansion but continues to apply or something); so in this case, 1 , 2 would be that strange thing in your String(1 comma 2) call; i.e., Quote is being called with 1 , 2 but in that case it's actually one argument.

How to force additional preprocessor macro scan

I have the following code:
#define FOO_BAR x
#define FOO(x) FOO_BAR
I do want FOO(2) to expand to 2, but I'm getting x instead. I tried to use EXPAND macro to force extra scan:
#define FOO_BAR x
#define EXPAND(x) x
#define FOO(x) EXPAND(FOO_BAR)
Note, this is intentional, that FOO_BAR doesn't accept x as an argument. Basically, I cannot pass x to FOO_BAR.
But it doesn't work as well. Any ideas?
I want this to work on any compiler (MSVC, gcc, clang).
What exactly I am trying to accomplish
My end goal is to create type safe enums for OpenGL. So, I need to do mapping from my safe enum to unsafe ones. So I have something like:
enum class my_enum {
foo,
bar
}
GLenum my_enum2gl(my_enum e) {
switch (e) {
case my_enum::foo: return GL_FOO;
case my_enum::bar: return GL_BAR;
}
return GL_NONE;
}
Since I'm lazy, I did some preprocessor magic. And implemented this as:
#define PP_IMPL_ENUM_VALUE(enum_pair) __PP_EVAL(__PP_IMPL_ENUM_VALUE enum_pair)
#define __PP_IMPL_ENUM_VALUE(cpp_enum, gl_enum) cpp_enum,
#define PP_IMPL_CONVERT(enum_pair) __PP_EVAL(__PP_IMPL_CONVERT enum_pair)
#define __PP_IMPL_CONVERT(cpp_enum, gl_enum) case name::cpp_enum: return gl_enum;
#define DEF_STATE_ENUM(name, ...) \
enum name { \
PP_FOR_EACH(PP_IMPL_ENUM_VALUE, ##__VA_ARGS__) \
}; \
namespace detail { \
GLenum name ## 2gl(name e) { \
switch(e) { \
__PP_EVAL(PP_FOR_EACH(PP_IMPL_CONVERT, ##__VA_ARGS__)) \
default: \
assert(!"Unknown value"); \
return GL_NONE; \
} \
} \
}
DEF_STATE_ENUM(my_enum,
(foo, GL_FOO),
(bar, GL_BAR)
)
The problem is that __PP_IMPL_CONVERT uses name which is not expanded. Passing x to FOO_BAR would mean that I'm passing some extra parameter to a functor for PP_FOR_EACH.

You need to understand
The preprocessor fully expands the arguments to each function-like macro before substituting them into the macro's expansion, except where they are operands of the # or ## preprocessing operators (in which case they are not expanded at all).
After modifying the input preprocessing token sequence by performing a macro expansion, the preprocessor automatically rescans the result for further macro expansions to perform.
In your lead example, then, given
#define FOO_BAR x
#define FOO(x) FOO_BAR
and a macro invocation
FOO(2)
, the preprocessor first macro expands the argument 2, leaving it unchanged, then replaces the macro call with its expansion. Since the expansion does not, in fact, use the argument in the first place, the initial result is
FOO_BAR
The preprocessor then rescans that, recognizes FOO_BAR as the identifier of an object-like macro, and replaces it with its expansion, yielding
x
, as you observed. This is the normal and expected behavior of a conforming C preprocessor, and to the best of my knowledge, C++ has equivalent specifications for its preprocessor.
Inserting an EXPAND() macro call does not help, because the problem is not failure to expand macros, but rather the time and context of macro expansion. Ultimately, it should not be all that surprising that when the replacement text of macro FOO(x) does not use the macro parameter x, the actual argument associated with that parameter has no effect on the result of the expansion.
I cannot fully address your real-world code on account of the fact that it depends centrally on a macro PP_FOR_EACH() whose definition you do not provide. Presumably that macro's name conveys the gist, but as you can see, the details matter. However, if you in fact understand how your PP_FOR_EACH macro actually works, then I bet you could come up with a variant that accepts an additional leading argument, by which you could convey (the same) name to each expansion.
Alternatively, this is the kind of problem for which X Macros were invented. I see that that alternative has already been raised in comments. You might even be able -- with some care -- to build a solution that uses X macros inside, so as to preserve the top-level interface you now have.

How do I combine BOOST_PP_IF with BOOST_PP_LPAREN?

I'm trying to conditionally expand a macro to either "( a" or "b )", but the naive way of doing so doesn't work on either of the compilers I'm using (Microsoft C/C++ and the NDK compiler). Example:
// This works on both compilers, expands to ( a ) as expected
#define PARENS_AND_SUCH BOOST_PP_IF(1, BOOST_PP_LPAREN() a BOOST_PP_RPAREN(), b)
// MSVC: syntax error/unexpected end of file in macro expansion
// NDK: unterminated argument list
#define PARENS_AND_SUCH BOOST_PP_IF(1, BOOST_PP_LPAREN() a, b)
// Desired expansion: ( a
// MSVC expansion: ( a, b )
// NDK: error: macro "BOOST_PP_IIF" requires 3 arguments, but only 2 given
#define PARENS_AND_SUCH BOOST_PP_IF(1, BOOST_PP_LPAREN() a, b BOOST_PP_RPAREN())
What am I doing wrong?

You could force the order of evaluation to conform to the expected one by abstracting out the branches of the IF to subdefinitions, and delay their expansion until the conditional returns a branch:
#define PARENS_AND_SUCH BOOST_PP_CAT(PAS_, BOOST_PP_IF(1, THEN, ELSE))
#define PAS_THEN BOOST_PP_LPAREN() a
#define PAS_ELSE b BOOST_PP_RPAREN()
Since THEN and ELSE aren't complete names, the branches will not be expanded before the IF is expanded; when it returns, the value is combined with PAS_ to form a new valid definition and will expand at that time.
You could also parameterise the THEN and ELSE macros and make this technique more general (and IMO more elegant): passing parameters to an incomplete name essentially forms a thunk, and works pretty much the same way (the incomplete function-like macro name will be passed around plus parameter list until it's completed).

How Do we Interpret This Complex C++ Preprocessor Macro Replacement

I am studying the C++ standard on how the C++ preprocessor handles macro substitution in detail (I need to implement a subset of the C++ preprocessor myself). And here is an example I created for my studying:
#define a x
#define x(x,y) x(x+a, y+1)
a(x(90, 80), a(1,2))
By asking VC++ 2010 to generate the preprocessor output file, I found that the above a(x(90, 80), a(1,2)) becomes this:
90(90+x, 80+1)(90(90+x, 80+1)+x, 1(1+x, 2+1)+1);
But how does the preprocessor come up with this output? The rules are too complicated to comprehend. Can someone explain all the steps the preprocessor has done to come up with such a result?

Old answer, order is not exact (see edit):
Let's start from your expression:
a(x(90, 80), a(1, 2))
Now, since we have #define a x, it gets expanded to:
x(x(90, 80), x(1, 2))
// ^^^^^^^^^ ^^^^^^^
// arg 'x' arg 'y'
We can apply the definition of x(x,y), which is #define x(x,y) x(x+a, y+1):
x(90, 80)(x(90, 80)+a, x(1, 2)+1)
There is another pass that will expand x(...). You can also notice that the +a that > was in the previous expression was expanded to +x:
90(90+a, 80+1)(90(90+a, 80+1)+x, 1(1+a, 2+1)+1)
// ^^
// expanded
Last: the +a that remains are expanded to +x:
90(90+x, 80+1)(90(90+x, 80+1)+x, 1(1+x, 2+1)+1)
// ^^ ^^ ^^
// expanded expanded expanded
I hope there are no errors.
Please note that your definition of x(x,y) is quite ambiguous (for humans): the macro name and a parameter share the same name. Note that even withouth that, macros are not recursive, so if you had
#define x(u,v) x(u+a, b+1)
it would not expand to something like
x(u+a+a+a+a, b+1+1+1+1)
This is because when the macro x is defined, its name is not 'available' to the inner macro definition.
Another small note: for gcc, the output is not exactly the same, as gcc add spaces between replaced tokens (but if you remove them it will be the same as msvc).
EDIT: from the comments of dyp, this order is not the exact one. In fact, parameters are expanded first and then substituted in the macro expression. The last part of the sentence is important: that means that the macro parameter list is not re-evalued. Think of it as: macro gets expanded with placeholders in lieu of the parameters, then arguments are expanded, and then the placeholders are replaced by their respective argument. So, in short, that is equivalent to what I explained before, but here is the right order (detailed operations):
> Expansion of a(x(90, 80), a(1, 2))
> Substitution of 'a' into 'x' (now: 'x(x(90, 80), a(1, 2))')
> Expansion of x(x(90, 80), a(1, 2)) [re-scan]
> Macro 'x(X, Y)' is expanded to 'X(X+a,Y+1)'
> Expansion of 'x(90,80)' (first argument)
> Macro 'x(X,Y)' is expanded to 'X(X+a,Y+1)'
> Argument '90' does not need expansion (ie, expanded to same)
> Argument '80' does not need expansion (ie, expanded to same)
> Substitution with 'X=90' and 'Y=80': '90(90+a, 80+1)'
> Re-scan of result (ignoring macro name 'x')
> Substitution of 'a' into 'x': '90(90+x, 80+1)'
> Expansion of 'a(1,2)' (second argument)
> Substitution of 'a' into 'x'
> Expansion of 'x(1,2)' [re-scan]
> Macro 'x(X,Y)' is expanded to 'X(X+a,Y+1)'
> Argument '1' does not need expansion (ie, expanded to same)
> Argument '2' does not need expansion (ie, expanded to same)
> Substitution with 'X=1' and 'Y=2': '1(1+a, 2+1)'
> Re-scan of result (ignoring macro name 'x')
> Substitution of 'a' into 'x': '1(1+x, 2+1)'
> Substitution with X='90(90+x, 80+1)' and Y='1(1+x, 2+1)'
Result: '90(90+x, 80+1)(90(90+x, 80+1)+a, 1(1+x, 2+1)+1)'
> Re-scan of result
> Substitution of 'a' into 'x'
Result: '90(90+x, 80+1)(90(90+x, 80+1)+x, 1(1+x, 2+1)+1)'
Last result is result of whole expansion:
90(90+x, 80+1)(90(90+x, 80+1)+x, 1(1+x, 2+1)+1)

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js