Optional Parameters with C++ Macros
Why does the author of one of the messages in this thread use an additional comma in the macro here?
#define PRINT_STRING_MACRO_CHOOSER(...) \
GET_4TH_ARG(__VA_ARGS__, PRINT_STRING_3_ARGS, \
PRINT_STRING_2_ARGS, PRINT_STRING_1_ARGS, )
This has been done so that GET_4TH_ARG will always be supplied with its vararg arguments (which is a requirement of the language).
For example, without it,
PRINT_STRING_MACRO_CHOOSER("Hello, World")
would expand to
GET_4TH_ARG("Hello, World", PRINT_STRING_3_ARGS, PRINT_STRING_2_ARGS, PRINT_STRING_1_ARGS)
rather than
GET_4TH_ARG("Hello, World", PRINT_STRING_3_ARGS, PRINT_STRING_2_ARGS, PRINT_STRING_1_ARGS,)
The first form does not provide any vararg arguments (and so would not be a valid call), whereas the second form provides an empty vararg argument to GET_4TH_ARG.
From the C++ standard: [cpp.replace]/4:
If the identifier-list in the macro definition does not end with an ellipsis, the number of arguments (including those arguments consisting of no preprocessing tokens) in an invocation of a function-like macro shall equal the number of parameters in the macro definition. Otherwise, there shall be more arguments in the invocation than there are parameters in the macro definition (excluding the ...). ...
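To see the trailing comma at work, here is a minimal sketch of the whole optional-parameter pattern. The definitions of GET_4TH_ARG, the three PRINT_STRING_N_ARGS macros, the PRINT_STRING front end and the repeat_string helper are not shown in the thread; they are assumptions made up for illustration, and the sketch assumes a conforming preprocessor such as GCC or Clang.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: repeats msg count times and appends suffix.
inline std::string repeat_string(const std::string& msg, int count,
                                 const std::string& suffix) {
    std::string out;
    for (int i = 0; i < count; ++i) out += msg;
    return out + suffix;
}

// Assumed definition: picks the fourth argument; "..." absorbs the rest.
#define GET_4TH_ARG(arg1, arg2, arg3, arg4, ...) arg4

// Assumed "overloads", one per argument count, filling in defaults.
#define PRINT_STRING_1_ARGS(msg)            repeat_string(msg, 1, "")
#define PRINT_STRING_2_ARGS(msg, n)         repeat_string(msg, n, "")
#define PRINT_STRING_3_ARGS(msg, n, suffix) repeat_string(msg, n, suffix)

// The chooser from the question. The trailing comma guarantees that
// GET_4TH_ARG always receives a (possibly empty) variadic argument.
#define PRINT_STRING_MACRO_CHOOSER(...) \
    GET_4TH_ARG(__VA_ARGS__, PRINT_STRING_3_ARGS, \
                PRINT_STRING_2_ARGS, PRINT_STRING_1_ARGS, )

// Assumed front end: choose an overload, then call it with the same arguments.
#define PRINT_STRING(...) PRINT_STRING_MACRO_CHOOSER(__VA_ARGS__)(__VA_ARGS__)

inline void print_string_demo() {
    assert(PRINT_STRING("ab") == "ab");              // 1 argument
    assert(PRINT_STRING("ab", 2) == "abab");         // 2 arguments
    assert(PRINT_STRING("ab", 2, "!") == "abab!");   // 3 arguments
}
```

With one argument the user's argument occupies slot 1, so PRINT_STRING_1_ARGS lands in slot 4; every extra argument shifts the list right by one and selects the next overload.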
Related
I know that in expanding a function-like preprocessor macro, the # and ## tokens in the top-level substitution list essentially act "before" any macro expansions on the argument. For example, given
#define CONCAT_NO_EXPAND(x,y,z) x ## y ## z
#define EXPAND_AND_CONCAT(x,y,z) CONCAT_NO_EXPAND(x,y,z)
#define A X
#define B Y
#define C Z
then CONCAT_NO_EXPAND(A,B,C) is the pp-token ABC, and EXPAND_AND_CONCAT(A,B,C) is the pp-token XYZ.
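A small sketch that makes the difference observable at run time. The two variables ABC and XYZ are hypothetical names added only so that each pasted identifier refers to something.

```cpp
#include <cassert>

#define CONCAT_NO_EXPAND(x,y,z) x ## y ## z
#define EXPAND_AND_CONCAT(x,y,z) CONCAT_NO_EXPAND(x,y,z)
#define A X
#define B Y
#define C Z

// Hypothetical variables named after the two possible paste results.
static const int ABC = 1;
static const int XYZ = 2;

// x, y and z are operands of ##, so A, B and C are pasted unexpanded: ABC.
inline int pasted_without_expansion() { return CONCAT_NO_EXPAND(A,B,C); }

// Here x, y and z are not operands of ## in EXPAND_AND_CONCAT itself, so
// they are fully expanded to X, Y, Z before being passed on: XYZ.
inline int pasted_after_expansion() { return EXPAND_AND_CONCAT(A,B,C); }

inline void concat_demo() {
    assert(pasted_without_expansion() == 1);
    assert(pasted_after_expansion() == 2);
}
```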
But what if I want to define a macro that expands just some of its arguments before pasting? For example, I would like a macro that allows only the middle of three arguments to expand, then pastes it together with an exact unexpanded prefix and an exact unexpanded suffix, even if the prefix or suffix is the identifier of an object-like macro. That is, if again we have
#define MAGIC(x,y,z) /* What here? */
#define A X
#define B Y
#define C Z
then MAGIC(A,B,C) is AYC.
A simple attempt like
#define EXPAND(x) x
#define MAGIC(x,y,z) x ## EXPAND(y) ## z
results in the error "pasting ")" and "C" does not give a valid preprocessing token". This makes sense (and I assume it's also producing the unwanted token AEXPAND).
Is there any way to get that sort of result using just standard, portable preprocessor rules? (No extra code-generating or -modifying tools.)
If not, maybe a way that works on most common implementations? Here Boost.PP would be fair game, even if it involves some compiler-specific tricks or workarounds under the hood.
If it makes any difference, I'm most interested in the preprocessor steps as defined in C++11 and C++17.
Here's a solution:
#define A X
#define B Y
#define C Z
#define PASTE3(q,r,s) q##r##s
#define MAGIC(x,y,z,...) PASTE3(x##__VA_ARGS__,y,__VA_ARGS__##z)
MAGIC(A,B,C,)
Note that the invocation "requires" another argument (see below for why); but:
MAGIC(A,B,C) is compliant in C++20
MAGIC(A,B,C) will "work" in many C++11/C++17 preprocessors (e.g., GNU/Clang), but that is an extension, not C++11/C++17 compliant behavior
I know that in expanding a function-like preprocessor macro, the # and ## tokens in the top-level substitution list essentially act "before" any macro expansions on the argument.
To be more precise, there are four steps to macro expansion:
1. argument identification
2. argument substitution
3. stringification and pasting (in an unspecified order)
4. rescan and further replacement
Argument identification associates parameters in the macro definition with arguments in an invocation. In this case, x associates with A, y with B, z with C, and ... with a "placemarker" (an abstract empty value associated with a parameter whose argument has no tokens). Before C++20, a macro defined with ... requires at least one argument for it in every invocation (that argument may be empty, as in MAGIC(A,B,C,)); since C++20, which also added the __VA_OPT__ feature, supplying an argument for the ... in an invocation is optional.
Argument substitution is the step where arguments are expanded. Specifically, what happens here is that for each parameter in the macro's replacement list (here, PASTE3(x##__VA_ARGS__,y,__VA_ARGS__##z)), where said parameter does not participate in a paste or stringification, the associated argument is fully expanded as if it appeared outside of an invocation; then, all mentions of that parameter in the replacement list that do not participate in stringification and paste are replaced with the expanded result. For example, at this step for the MAGIC(A,B,C,) invocation, y is the only mentioned qualifying parameter, so B is expanded producing Y; at that point we get PASTE3(x##__VA_ARGS__,Y,__VA_ARGS__##z).
The next step applies the paste and stringification operators in no particular order. Placemarkers are needed here specifically because you want to expand the middle argument and not the ends, and you don't want extra tokens. To keep A from expanding to X, and to have it stay A (as opposed to changing to "A"), you need to avoid argument substitution. Argument substitution is avoided in only two ways, pasting or stringifying, so since stringification doesn't work here we have to paste. And since you want that token to stay the same as what you had, you need to paste it to a placemarker (which means you need one to paste to, which is why there's another parameter).
Once this macro applies the pastes to the "placemarkers", you wind up with PASTE3(A,Y,C); then there is the rescan and further replacement step, during which PASTE3 is identified as a macro invocation. Fast forwarding: since PASTE3 pastes all of its arguments, argument substitution doesn't apply to any of them, the pastes are done in "some order", and we wind up with AYC.
As a final note, in this solution I'm using a varying argument to produce the placemarker token precisely because it allows invocations of the form MAGIC(A,B,C) in at least C++20. I'm left-pasting that to z because that makes the addition at least potentially useful for something else (MAGIC(A,B,C,_) would use _ as a "delimiter" to produce A_Y_C).
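The walkthrough can be checked with a short sketch. The variables AYC and A_Y_C are hypothetical, added only so that the generated identifiers name something; the trailing-comma form is used so the invocation is also valid before C++20.

```cpp
#include <cassert>

#define A X
#define B Y
#define C Z
#define PASTE3(q,r,s) q##r##s
#define MAGIC(x,y,z,...) PASTE3(x##__VA_ARGS__,y,__VA_ARGS__##z)

// Hypothetical variables named after the expected results.
static const int AYC = 42;
static const int A_Y_C = 7;

// Empty variadic argument: A and C are pasted with placemarkers and stay
// unexpanded, while B is expanded to Y. The result is AYC.
inline int magic_plain() { return MAGIC(A,B,C,); }

// Non-empty variadic argument: _ is pasted onto A and in front of C,
// which yields A_Y_C.
inline int magic_delimited() { return MAGIC(A,B,C,_); }

inline void magic_demo() {
    assert(magic_plain() == 42);
    assert(magic_delimited() == 7);
}
```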
This piece of code compiles in Visual Studio 2015, but not in Clang:
#define COMMA ,
#define MC(a) a
#define MA(a,b,c) MC(a b c)
map <MA(int,COMMA,int)> FF;
It appears that Clang expands the COMMA macro before submitting it to the MC() macro. "Who is right" according to the C++ standard? Also, how can I make Clang behave like Visual Studio?
EDIT: Simplified the example, and changed some macro names.
Clang conforms to the standard; Visual Studio doesn't. I think you will have a lot of trouble getting Clang to not conform to the standard, so I won't attempt to answer "how do I get Clang to act like Visual Studio?". Maybe that wasn't really what you wanted to know.
When the compiler identifies the invocation of a function-like macro (that is, a macro with parameters) it expands the macro using the procedure explained in detail in §16.3 [cpp.replace] of the C++ standard. In the following, I've simplified the procedure by not considering the # and ## operators, because they do not appear in your example and the full procedure is more complicated.
We'll examine the invocation MA(int, COMMA, int). Here's what happens after the compiler sees the tokens MA and (, which indicate an invocation of the macro.
1. The compiler identifies what the arguments are, which involves finding the closing parenthesis. There are three arguments, which corresponds to the number of parameters, so that's OK. The arguments have not yet been expanded, so the compiler only sees the punctuation actually in the source file. It identifies the arguments as int, COMMA and int.
2. Every argument (except the ones whose corresponding parameter participates in token concatenation or stringification -- but, as I said, I'm not going to go into that scenario here) is then fully expanded. This happens before they are substituted into the macro body, so that the names of the macro's parameters don't leak out of the macro. So now the three arguments are int, , and int.
3. A copy of the macro body is made, in which each parameter is substituted with the corresponding (fully expanded) argument. The macro body ("replacement list", in standardese) was MC(a b c); after substituting the arguments, that becomes MC(int , int).
4. The sequence of tokens created in step 3 is inserted into the input in place of the macro invocation, and preprocessing continues.
At this point, the compiler will see the invocation of the function-like macro MC(int , int), and will proceed as above. However, this time the first step fails because two arguments are identified but the macro MC only has one parameter.
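If changing the macros is an option, one conforming way to get the intended result is to let MC accept any number of arguments. This is only a sketch under that assumption (the variadic MC, the FF_type alias and the ff_lookup helper are not from the question), and it relies on C++11 variadic macros.

```cpp
#include <cassert>
#include <map>
#include <type_traits>

#define COMMA ,
// Assumption: MC is made variadic, so the comma that COMMA has already
// expanded to is accepted as an argument separator instead of being an error.
#define MC(...) __VA_ARGS__
#define MA(a,b,c) MC(a b c)

// MA(int,COMMA,int) -> MC(int , int) -> int , int
using FF_type = std::map<MA(int,COMMA,int)>;

static_assert(std::is_same<FF_type, std::map<int, int>>::value,
              "MA(int,COMMA,int) should expand to int , int");

// Hypothetical helper that exercises the resulting map type.
inline int ff_lookup() {
    FF_type FF;
    FF[1] = 10;
    return FF.at(1);
}

inline void comma_demo() { assert(ff_lookup() == 10); }
```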
I sometimes deliberately omit macro arguments. For example, for a function-like macro like
#define MY_MACRO(A, B, C) ...
I might call it as:
MY_MACRO(, bar, baz)
There are still technically 3 arguments; it's just that the first one is "empty". This question is not about variadic macros.
When I do this I get warnings from g++ when compiling with -ansi (aka -std=c++98), but not when I use -std=c++0x. Does this mean that empty macro args are legal in the new C++ standard?
That's the entirety of my question, but anticipating the "why would you want to?" response, here's an example. I like keeping .h files uncluttered by function bodies, but implementing simple accessors outside of the .h file is tedious. I therefore wrote the following macro:
#define IMPLEMENT_ACCESSORS(TEMPLATE_DECL, RETURN_TYPE, CLASS, FUNCTION, MEMBER) \
TEMPLATE_DECL \
inline RETURN_TYPE* CLASS::Mutable##FUNCTION() { \
return &MEMBER; \
} \
\
TEMPLATE_DECL \
inline const RETURN_TYPE& CLASS::FUNCTION() const { \
return MEMBER; \
}
This is how I would use it for a class template that contains an int called int_:
IMPLEMENT_ACCESSORS(template<typename T>, int, MyTemplate<T>, Int, int_)
For a non-template class, I don't need template<typename T>, so I omit that macro argument:
IMPLEMENT_ACCESSORS(, int, MyClass, Int, int_)
If I understand correctly, empty macro arguments have been allowed since C99 and
C++0x (C++11).
C99 6.10.3/4 says:
... the number of arguments (including those arguments consisting of
no preprocessing tokens) shall equal the number of parameters ...
and C++ N3290 16.3/4 has the same statement, while C++03 16.3/10 mentions:
... any argument consists of no preprocessing tokens, the behavior is
undefined.
I think an empty argument falls under the wording "arguments consisting of
no preprocessing tokens" quoted above.
Also, 6.10.3 in Rationale for International Standard Programming Languages C rev. 5.10
says:
A new feature of C99: Function-like macro invocations may also now
have empty arguments, that is, an argument may consist of no
preprocessing tokens.
Yes. The relevant bit is 16.3/11
The sequence of preprocessing tokens bounded by the outside-most
matching parentheses forms the list of arguments for the function-like
macro. The individual arguments within the list are separated by comma
preprocessing tokens.
There's no requirement that a single argument corresponds to precisely one token. In fact, the following section makes it clear that there can be more than one token per argument:
Before being substituted, each argument’s preprocessing tokens are
completely macro replaced as if they formed the rest of the
preprocessing file
In your case, one argument happens to correspond to zero tokens. That doesn't cause any contradiction.
[edit]
This was changed by N1566 to bring C++11 in line with C99.
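Applied to the accessor macro from the question, an empty first argument looks like this. The two class definitions are hypothetical, written only to give the macro something to implement; the sketch assumes C++11 or later.

```cpp
#include <cassert>

#define IMPLEMENT_ACCESSORS(TEMPLATE_DECL, RETURN_TYPE, CLASS, FUNCTION, MEMBER) \
    TEMPLATE_DECL                                                                \
    inline RETURN_TYPE* CLASS::Mutable##FUNCTION() { return &MEMBER; }           \
    TEMPLATE_DECL                                                                \
    inline const RETURN_TYPE& CLASS::FUNCTION() const { return MEMBER; }

// Hypothetical non-template class with an int member called int_.
class MyClass {
public:
    int* MutableInt();
    const int& Int() const;
private:
    int int_ = 0;
};

// Hypothetical class template with the same interface.
template <typename T>
class MyTemplate {
public:
    int* MutableInt();
    const int& Int() const;
private:
    int int_ = 0;
};

// Empty first argument: TEMPLATE_DECL is replaced by no tokens at all.
IMPLEMENT_ACCESSORS(, int, MyClass, Int, int_)
// Non-empty first argument for the class template.
IMPLEMENT_ACCESSORS(template <typename T>, int, MyTemplate<T>, Int, int_)

inline void accessor_demo() {
    MyClass plain;
    *plain.MutableInt() = 5;
    assert(plain.Int() == 5);

    MyTemplate<double> templated;
    *templated.MutableInt() = 9;
    assert(templated.Int() == 9);
}
```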
When I do that I normally put a comment in place of the argument.
Place a macro that will be expanded to the empty string.
#define NOARG
...
MY_MACRO(/*Ignore this Param*/ NOARG, bar, baz)
PS. I got no warning with g++ with or without the -std=c++98 flag.
g++ (Ubuntu 4.4.3-4ubuntu5) 4.4.3
g++ (Apple Inc. build 5666) 4.2.1
#define LINK_ENTITY_TO_CLASS(mapClassName,DLLClassName) \
static CEntityFactory<DLLClassName> mapClassName( #mapClassName );
This is a macro from the Alien Swarm mod for Half-Life 2, meant to be compiled with MSVC.
I've never seen an argument preceded by a # in a macro before, and I'm not sure if this is a MSVC specific thing or just uncommon. What does it mean?
This is part of both standard C and C++ and is not implementation-specific. The # preprocessing operator stringizes its argument. It takes whatever tokens were passed into the macro for the parameter designated by its operand (in this case, the parameter mapClassName) and makes a string literal out of them. So, for a simple example,
#define STRINGIZE(x) # x
STRINGIZE(Hello World)
// gets replaced with
"Hello World"
Note that the argument tokens are not macro replaced before they are stringized, so if Hello or World were defined as a macro, the result would still be the same. You need to use an extra level of indirection to get the arguments macro replaced (that linked answer discusses the concatenation operator, ##, but applies equally to the stringization operator).
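A minimal sketch of that extra level of indirection. The names STRINGIZE_EXPANDED and ANSWER are made up for this example.

```cpp
#include <cassert>
#include <string>

#define STRINGIZE(x) # x
// Hypothetical wrapper: x is not an operand of # here, so it is fully
// macro replaced before STRINGIZE sees it.
#define STRINGIZE_EXPANDED(x) STRINGIZE(x)

// Hypothetical object-like macro to stringize.
#define ANSWER 42

inline std::string direct_text()   { return STRINGIZE(ANSWER); }          // "ANSWER"
inline std::string expanded_text() { return STRINGIZE_EXPANDED(ANSWER); } // "42"
inline std::string hello_text()    { return STRINGIZE(Hello World); }     // "Hello World"

inline void stringize_demo() {
    assert(direct_text() == "ANSWER");
    assert(expanded_text() == "42");
    assert(hello_text() == "Hello World");
}
```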
Is it possible to call function-like macros with fewer than all the parameters on Linux?
Doing this only generates a warning in Visual Studio (warning 4003), and the missing arguments are replaced with "".
But compiling it using g++ generates an error in linux ("error: macro *** requires ** arguments, but only ** given").
Is there any possible way to disable this or overcome it?
The number of arguments in a macro invocation must exactly match the number of parameters in the macro definition. So, no, you cannot invoke a macro with fewer arguments than it has parameters.
To "overcome" it, you can define multiple differently named macros with different numbers of parameters.
C++0x (which is not yet standard, but which your compiler might partially support) adds support for variadic macros which can be called with a variable number of arguments.
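Both suggestions can be sketched briefly. All names here (AREA_1, AREA_2, SUM, sum_all) are hypothetical, and the variadic macro assumes C++11 or later.

```cpp
#include <cassert>
#include <initializer_list>

// Option 1: differently named macros, one per parameter count.
#define AREA_1(side)          ((side) * (side))
#define AREA_2(width, height) ((width) * (height))

// Hypothetical helper used by the variadic macro below.
inline int sum_all(std::initializer_list<int> values) {
    int total = 0;
    for (int v : values) total += v;
    return total;
}

// Option 2: a variadic macro that forwards however many arguments it gets.
#define SUM(...) sum_all({__VA_ARGS__})

inline void arity_demo() {
    assert(AREA_1(3) == 9);
    assert(AREA_2(3, 4) == 12);
    assert(SUM(1) == 1);
    assert(SUM(1, 2, 3) == 6);
}
```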
The standard (§16.3 - Macro replacement) is clear that you have to pass the same number of arguments:
"If the identifier-list in the macro
definition does not end with an
ellipsis, the number of arguments
(including those arguments consisting
of no preprocessing tokens) in an
invocation of a function-like macro
shall equal the number of parameters
in the macro definition."
I don't know of any g++ option to override this.