Weird behavior of variadic macro expansion with gcc and clang

I'm writing a variadic dispatcher macro in C++, to call a different macro based on the number of arguments (from none up to 5) provided to the dispatcher. I came up with this solution:
#define GETOVERRIDE(_ignored, _1, _2, _3, _4, _5, NAME, ...) NAME
#define NAMEDARGS(...) GETOVERRIDE(ignored, ##__VA_ARGS__, NAMEDARGS5, NAMEDARGS4, NAMEDARGS3, NAMEDARGS2, NAMEDARGS1, NAMEDARGS0)(__VA_ARGS__)
NAMEDARGS is the dispatcher macro; calling it with 1 argument will result in a call to NAMEDARGS1 which takes 1 argument, and so on (I don't provide the implementations of the various NAMEDARGS# since they are irrelevant in this context).
I tested the code with gcc 7.1.1 and found weird behavior in the gcc expansion when using the -std=c++14 flag. With this test code:
NAMEDARGS()
NAMEDARGS(int)
NAMEDARGS(int, float)
I get these expansions:
$ gcc -E testMacro.cpp
NAMEDARGS0()
NAMEDARGS1(int)
NAMEDARGS2(int, float)
$ gcc -E -std=c++14 testMacro.cpp
NAMEDARGS1()
NAMEDARGS1(int)
NAMEDARGS2(int, float)
It seems that using the -std=c++14 flag the substitution of the zero-argument call fails, resulting in the call of the one-argument macro. I thought that this could be because the ##__VA_ARGS__ syntax is a GNU extension, thus not working with an ISO C++ preprocessor; however, when trying with clang 4.0.1 I obtain the desired expansion:
$ clang -E -std=c++14 testMacro.cpp
NAMEDARGS0()
NAMEDARGS1(int)
NAMEDARGS2(int, float)
So I don't understand what's going on here. Does clang implement this GNU extension and accept non-ISO code even with -std=c++14, unlike gcc? Or does the problem lie elsewhere? Thanks for the help.

GCC 7 defaults -std to gnu++14, which is C++14 with GNU extensions.
Comparing the two with only NAMEDARGS(...) defined shows how the expansions differ:
Code
#define NAMEDARGS(...) GETOVERRIDE(ignored, ##__VA_ARGS__, NAMEDARGS5, NAMEDARGS4, NAMEDARGS3, NAMEDARGS2, NAMEDARGS1, NAMEDARGS0)(__VA_ARGS__)
NAMEDARGS()
-std=gnu++14 -E
GETOVERRIDE(ignored, NAMEDARGS5, NAMEDARGS4, NAMEDARGS3, NAMEDARGS2, NAMEDARGS1, NAMEDARGS0)()
-------------------^
-std=c++14 -E
GETOVERRIDE(ignored,, NAMEDARGS5, NAMEDARGS4, NAMEDARGS3, NAMEDARGS2, NAMEDARGS1, NAMEDARGS0)()
-------------------^^
I'm not an experienced standard reader, but I found the following two passages in [cpp.replace] which suggest that GCC is correct in both invocations:
If the identifier-list in the macro definition does not end with an ellipsis, the number of arguments (including those arguments consisting of no preprocessing tokens) in an invocation of a function-like macro shall equal the number of parameters in the macro definition. Otherwise, there shall be more arguments in the invocation than there are parameters in the macro definition (excluding the ...). There shall exist a ) preprocessing token that terminates the invocation.
...
If there is a ... immediately preceding the ) in the function-like macro definition, then the trailing arguments, including any separating comma preprocessing tokens, are merged to form a single item: the variable arguments. The number of arguments so combined is such that, following merger, the number of arguments is one more than the number of parameters in the macro definition (excluding the ...).
It seems correct then that an empty __VA_ARGS__ is expanded to a single empty argument.
I can't find whether clang's behaviour here is intended.
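To observe the difference without reading preprocessor output, the dispatcher can be compiled against placeholder implementations that simply expand to their own arity. This is only a sketch: the bodies of NAMEDARGS0 through NAMEDARGS5 below are hypothetical stand-ins (the question does not show the real ones), and the constant names are invented for illustration.

```cpp
// Dispatcher from the question, unchanged.
#define GETOVERRIDE(_ignored, _1, _2, _3, _4, _5, NAME, ...) NAME
#define NAMEDARGS(...) GETOVERRIDE(ignored, ##__VA_ARGS__, NAMEDARGS5, NAMEDARGS4, NAMEDARGS3, NAMEDARGS2, NAMEDARGS1, NAMEDARGS0)(__VA_ARGS__)

// Hypothetical implementations: each one just expands to its own arity.
#define NAMEDARGS0() 0
#define NAMEDARGS1(a) 1
#define NAMEDARGS2(a, b) 2
#define NAMEDARGS3(a, b, c) 3
#define NAMEDARGS4(a, b, c, d) 4
#define NAMEDARGS5(a, b, c, d, e) 5

// One or more arguments select the same macro in both modes.
constexpr int one_arg  = NAMEDARGS(int);
constexpr int two_args = NAMEDARGS(int, float);

// Zero arguments: going by the expansions shown above, this selects
// NAMEDARGS0 with -std=gnu++14 and NAMEDARGS1 (called with one empty
// argument) with gcc -std=c++14, so the value depends on the mode.
constexpr int no_args = NAMEDARGS();
```

Printing no_args, or checking it in a static_assert, then shows which expansion the compiler chose for the flags in use.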

Related

How to resolve "must specify at least one argument for '...' parameter of variadic macro"

You can define a variadic macro in C++ like:
#define FOO(x, ...) bar(x, __VA_ARGS__)
But calling FOO as FOO(1) results in the macro expansion bar(1,) which is obviously a syntactical error and won't compile.
Therefore GCC includes a GNU extension:
#define FOO(x, ...) bar(x, ##__VA_ARGS__)
which would expand the given example to the desired result bar(1). Although ##__VA_ARGS__ is a GNU extension, it is supported by clang too, which emits a warning under the -pedantic flag:
warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments].
Therefore C++20 includes a new mechanism to achieve the desired result in a standard compliant way:
#define FOO(x, ...) bar(x __VA_OPT__(,) __VA_ARGS__)
This will add the , only if the following __VA_ARGS__ is not empty; otherwise it will omit the ,. This new feature currently works with the GCC and clang trunks (with the -std=c++2a flag enabled): https://godbolt.org/z/k2nAE6.
My only problem is that clang emits a warning under -pedantic:
warning: must specify at least one argument for '...' parameter of variadic macro [-Wgnu-zero-variadic-macro-arguments] (GCC does not emit a warning).
But why? This only seems to make sense if someone only uses __VA_ARGS__ and passes no arguments to the macro. But with the new extension __VA_OPT__ I explicitly handle the case for which no argument is given.
So why would clang emit a warning in this case and how can I work around it?

This is already legal in C++20; it appears that Clang just hasn't updated their warnings yet.
The C++20 standard (from N4868) says in [cpp.replace.general]/5:
If the identifier-list in the macro definition does not end with an ellipsis [...] Otherwise, there shall be at least as many arguments in the invocation as there are parameters in the macro definition (excluding the ...). There shall exist a ) preprocessing token that terminates the invocation.
Compare the "at least as many arguments" wording with the equivalent statement in C++17's [cpp.replace]/4 (from N4659):
If the identifier-list in the macro definition does not end with an ellipsis [...] Otherwise, there shall be more arguments in the invocation than there are parameters in the macro definition (excluding the ...). There shall exist a ) preprocessing token that terminates the invocation.
There is a similar comparison to be made between C++20's [cpp.replace.general]/15 and C++17's [cpp.replace]/12.
That is, C++17 had a requirement for FOO(x, ...) to be passed at least two arguments; C++20 has weakened that to only require one. Clang's -pedantic doesn't seem to have caught up yet.
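As a rough illustration of the C++20 rule, the sketch below calls the FOO macro from the question with and without variable arguments. The two bar overloads are hypothetical, added only so that both expansions compile, and PICK and COUNT are names invented here to show the same idea applied to argument counting. It assumes a compiler that accepts __VA_OPT__ (C++20 mode, or an older mode where it is provided as an extension).

```cpp
// Hypothetical overloads so that both expansions of FOO are valid calls.
constexpr int bar(int x) { return x; }
constexpr int bar(int x, int y) { return x + y; }

// From the question: the comma is emitted only when __VA_ARGS__ is non-empty.
#define FOO(x, ...) bar(x __VA_OPT__(,) __VA_ARGS__)

constexpr int single = FOO(1);     // expands to bar(1)
constexpr int both   = FOO(1, 2);  // expands to bar(1, 2)

// Hypothetical argument counter built the same way: the separating comma
// after 'ignored' disappears when no arguments are passed.
#define PICK(_ignored, _1, _2, NAME, ...) NAME
#define COUNT(...) PICK(ignored __VA_OPT__(,) __VA_ARGS__, 2, 1, 0, unused)

constexpr int count_none = COUNT();
constexpr int count_one  = COUNT(int);
constexpr int count_two  = COUNT(int, float);
```

The trailing unused token in COUNT keeps the variable part of PICK non-empty even in the two-argument case, so the sketch does not itself rely on the relaxed C++20 wording for PICK.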

Macro expansion order confusion between compilers

This piece of code compiles in Visual Studio 2015, but not in Clang:
#define COMMA ,
#define MC(a) a
#define MA(a,b,c) MC(a b c)
map <MA(int,COMMA,int)> FF;
It appears that Clang expands the COMMA macro before submitting it to the MC() macro. "Who is right" according to the C++ standard? Also, how can I make Clang behave like Visual Studio?
EDIT: Simplified the example, and changed some macro names.

Clang conforms to the standard; Visual Studio doesn't. I think you will have a lot of trouble getting Clang to not conform to the standard, so I won't attempt to answer "how do I get Clang to act like Visual Studio?". Maybe that wasn't really what you wanted to know.
When the compiler identifies the invocation of a function-like macro (that is, a macro with parameters) it expands the macro using the procedure explained in detail in §16.3 [cpp.replace] of the C++ standard. In the following, I've simplified the procedure by not considering the # and ## operators, because they do not appear in your example and the full procedure is more complicated.
We'll examine the invocation MA(int, COMMA, int). Here's what happens after the compiler sees the tokens MA and (, which indicate an invocation of the macro.
1. The compiler identifies what the arguments are, which involves finding the closing parenthesis. There are three arguments, which corresponds to the number of parameters, so that's OK. The arguments have not yet been expanded, so the compiler only sees the punctuation actually in the source file. It identifies the arguments as int, COMMA and int.
2. Every argument (except those whose corresponding parameter participates in token concatenation or stringification -- but, as I said, I'm not going to go into that scenario here) is then fully expanded. This happens before they are substituted into the macro body, so that the names of the macro's parameters don't leak out of the macro. So now the three arguments are int, , and int.
3. A copy of the macro body is made, in which each parameter is substituted with the corresponding (fully expanded) argument. The macro body ("replacement list", in standardese) was MC(a b c); after substituting the arguments, that becomes MC(int , int).
4. The sequence of tokens created in step 3 is inserted into the input in place of the macro invocation, and preprocessing continues.
At this point, the compiler will see the invocation of the function-like macro MC(int , int), and will proceed as above. However, this time the first step fails because two arguments are identified but the macro MC only has one parameter.
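The final step fails only because MC accepts exactly one parameter. One possible way to sidestep that in conforming code, assuming variadic macros are available (C++11 or later), is to let MC take an ellipsis so that the comma produced by COMMA simply becomes part of __VA_ARGS__. This is a sketch rather than a way of reproducing Visual Studio's behavior, and the alias name FFType is made up for the example.

```cpp
#include <map>
#include <type_traits>

#define COMMA ,
// Variadic MC: however many arguments the expanded comma creates, they are
// all collected into __VA_ARGS__ and emitted unchanged.
#define MC(...) __VA_ARGS__
#define MA(a, b, c) MC(a b c)

// MA(int, COMMA, int) -> MC(int , int) -> int , int
using FFType = std::map<MA(int, COMMA, int)>;
```

With this definition the original declaration map <MA(int,COMMA,int)> FF; names a map from int to int.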

How to make GCC reduce 'arg,##__VA_ARGS__' to 'arg' to use it as single macro argument?

I have these macros defined for Visual Studio and clang, and they both compile fine:
#if defined(_MSC_VER)
# define _declare_func(...) PP_CAT(PP_CAT(_declare_func_, PP_NARG(__VA_ARGS__)),(__VA_ARGS__))
# define declare_func(...) _declare_func PP_LEFT_PAREN notused,##__VA_ARGS__ PP_RIGHT_PAREN
#else // clang version
# define _declare_func(...) PP_CAT(_declare_func_, PP_NARG(__VA_ARGS__))(__VA_ARGS__)
# define declare_func(...) _declare_func ( notused,##__VA_ARGS__ )
#endif
#define _declare_func_1(notused) void my_function()
#define _declare_func_2(notused, scope) void scope::my_function()
class MyClass
{
declare_func();
};
declare_func(MyClass) { }
PP_CAT is a classic multilevel concat macro
PP_NARG counts the number of macro arguments
PP_LEFT_PAREN and PP_RIGHT_PAREN reduce to '(' and ')'
Is there any way to achieve this with GCC? (I tried both macro versions with GCC 5.2; both fail to compile because the comma seems to be propagated during macro resolution and removed only at the end of preprocessing, making PP_NARG always reduce to '2' and never '1'.)
Thanks !

From the GCC documentation:
Second, the ‘##’ token paste operator has a special meaning when placed between a comma and a variable argument. If you write
#define eprintf(format, ...) fprintf (stderr, format, ##__VA_ARGS__)
and the variable argument is left out when the eprintf macro is used, then the comma before the ‘##’ will be deleted. This does not happen if you pass an empty argument, nor does it happen if the token preceding ‘##’ is anything other than a comma.
eprintf ("success!\n")
==> fprintf(stderr, "success!\n");
The above explanation is ambiguous about the case where the only macro parameter is a variable arguments parameter, as it is meaningless to try to distinguish whether no argument at all is an empty argument or a missing argument. In this case the C99 standard is clear that the comma must remain, however the existing GCC extension used to swallow the comma. So CPP retains the comma when conforming to a specific C standard, and drops it otherwise.
So for
#define declare_func(...) _declare_func ( notused,##__VA_ARGS__ )
the comma remains in standard-conforming modes (such as -std=c99 or -std=c++11); you may use -std=gnu99 or -std=gnu++11 to drop the comma and get a working macro.
To make your macro work with -std=c++11, you have to force it to take at least one argument.
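One reading of the "at least one argument" advice is to give the macro a named parameter before the ellipsis. The variable argument can then be left out entirely, which is the case where the quoted documentation says the comma is deleted. The sketch below assumes the GNU , ##__VA_ARGS__ extension is available; count_args and COUNT_FROM are hypothetical names used only to make the expansion observable.

```cpp
#include <cstddef>

// Hypothetical helper: reports how many arguments it actually received.
template <typename... Ts>
constexpr std::size_t count_args(Ts...) { return sizeof...(Ts); }

// A named parameter precedes the ellipsis, so the variable argument can be
// left out entirely; that is the case in which the comma is deleted.
#define COUNT_FROM(first, ...) count_args(first, ##__VA_ARGS__)

constexpr std::size_t just_first = COUNT_FROM(1);        // count_args(1)
constexpr std::size_t with_more  = COUNT_FROM(1, 2, 3);  // count_args(1, 2, 3)
```

The trade-off is that callers must always supply the first argument, so a zero-argument call such as declare_func() would need a placeholder.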

Are empty macro arguments legal in C++11?

I sometimes deliberately omit macro arguments. For example, for a function-like macro like
#define MY_MACRO(A, B, C) ...
I might call it as:
MY_MACRO(, bar, baz)
There are still technically 3 arguments; it's just that the first one is "empty". This question is not about variadic macros.
When I do this I get warnings from g++ when compiling with -ansi (aka -std=c++98), but not when I use -std=c++0x. Does this mean that empty macro args are legal in the new C++ standard?
That's the entirety of my question, but anticipating the "why would you want to?" response, here's an example. I like keeping .h files uncluttered by function bodies, but implementing simple accessors outside of the .h file is tedious. I therefore wrote the following macro:
#define IMPLEMENT_ACCESSORS(TEMPLATE_DECL, RETURN_TYPE, CLASS, FUNCTION, MEMBER) \
TEMPLATE_DECL \
inline RETURN_TYPE* CLASS::Mutable##FUNCTION() { \
return &MEMBER; \
} \
\
TEMPLATE_DECL \
inline const RETURN_TYPE& CLASS::FUNCTION() const { \
return MEMBER; \
}
This is how I would use it for a class template that contains an int called int_:
IMPLEMENT_ACCESSORS(template<typename T>, int, MyTemplate<T>, Int, int_)
For a non-template class, I don't need template<typename T>, so I omit that macro argument:
IMPLEMENT_ACCESSORS(, int, MyClass, Int, int_)

If I understand correctly, empty macro arguments have been allowed since C99 and C++0x (C++11).
C99 6.10.3/4 says:
... the number of arguments (including those arguments consisting of
no preprocessing tokens) shall equal the number of parameters ...
and C++ N3290 16.3/4 has the same statement, while C++03 16.3/10 mentions:
... any argument consists of no preprocessing tokens, the behavior is
undefined.
I think an empty argument falls under "arguments consisting of no preprocessing tokens" above.
Also, 6.10.3 in Rationale for International Standard Programming Languages C rev. 5.10
says:
A new feature of C99: Function-like macro invocations may also now
have empty arguments, that is, an argument may consist of no
preprocessing tokens.

Yes. The relevant bit is 16.3/11
The sequence of preprocessing tokens bounded by the outside-most
matching parentheses forms the list of arguments for the function-like
macro. The individual arguments within the list are separated by comma
preprocessing tokens.
There's no requirement that a single argument corresponds to precisely one token. In fact, the following section makes it clear that there can be more than one token per argument:
Before being substituted, each argument’s preprocessing tokens are
completely macro replaced as if they formed the rest of the
preprocessing file
In your case, one argument happens to correspond to zero tokens. That doesn't cause any contradiction.
[edit]
This was changed by N1566 to bring C++11 in line with C99.
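A minimal sketch of an argument that corresponds to zero tokens, assuming a C++11 or later compiler. APPLY_PREFIX is a hypothetical macro, not something from the question.

```cpp
// Hypothetical macro: PREFIX may be an operator or nothing at all.
#define APPLY_PREFIX(PREFIX, VALUE) PREFIX VALUE

// First argument is the single token '-'.
constexpr int negated = APPLY_PREFIX(-, 5);

// First argument consists of no preprocessing tokens; the macro still
// receives two arguments, and the expansion is just 5.
constexpr int plain = APPLY_PREFIX(, 5);
```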

When I do that I normally put a comment in place of the argument.
Place a macro that will be expanded to the empty string.
#define NOARG
...
MY_MACRO(/*Ignore this Param*/ NOARG, bar, baz)
PS. I got no warning with g++ with or without the -std=c++98 flag.
g++ (Ubuntu 4.4.3-4ubuntu5) 4.4.3
g++ (Apple Inc. build 5666) 4.2.1

Calling a C++ macro with fewer arguments

Is it possible to call function-like macros with fewer arguments than they have parameters on Linux?
Actually, doing this only generates a warning in Visual Studio (warning C4003), and the missing arguments are replaced with "".
But compiling it using g++ generates an error on Linux ("error: macro *** requires ** arguments, but only ** given").
Is there any possible way to disable this or overcome it?

The number of arguments in a macro invocation must exactly match the number of parameters in the macro definition. So, no, you cannot invoke a macro with fewer arguments than it has parameters.
To "overcome" it, you can define multiple differently named macros with different numbers of parameters.
C++0x (which is not yet standard, but which your compiler might partially support) adds support for variadic macros which can be called with a variable number of arguments.
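The two suggestions can be combined: keep separately named macros for each arity and put a variadic front end on top that selects one of them. The sketch below assumes a compiler with variadic macro support; AREA, AREA_1, AREA_2 and AREA_SELECT are hypothetical names, and the front end still needs at least one argument.

```cpp
// Differently named macros, one per number of parameters.
#define AREA_1(w)    ((w) * (w))
#define AREA_2(w, h) ((w) * (h))

// The caller's arguments push the candidate names to the right, so NAME
// lands on the macro whose arity matches the number of arguments.
#define AREA_SELECT(_1, _2, NAME, ...) NAME
#define AREA(...) AREA_SELECT(__VA_ARGS__, AREA_2, AREA_1, unused)(__VA_ARGS__)

constexpr int square = AREA(3);     // selects AREA_1
constexpr int rect   = AREA(3, 4);  // selects AREA_2
```

The trailing unused token keeps the variable part of AREA_SELECT non-empty, so the selection does not depend on any compiler extension.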

The standard (§16.3 - Macro replacement) is clear that you have to pass the same number of arguments:
"If the identifier-list in the macro definition does not end with an ellipsis, the number of arguments (including those arguments consisting of no preprocessing tokens) in an invocation of a function-like macro shall equal the number of parameters in the macro definition."
I don't know of any g++ option to override this.
