C++ preprocessors are not aware of template arguments? - c++

As it appears, the C++ preprocessor fails when a template instantiation with multiple arguments is passed to a macro as an argument.
See an example below.
#include <stdio.h>
#define FOO(v) printf("%d\n",v::val())
template<int N>
struct bar {
static int val() { return N; }
};
template<int N, int M>
struct baz {
static int val() { return N+M; }
};
int main() {
printf("%d\n",bar<1>::val());
printf("%d\n",baz<1,2>::val());
FOO(bar<10>); // OK
FOO(baz<20,30>); // error: too many arguments provided to function-like macro invocation
FOO((baz<20,30>)); // error: '::val' has not been declared
}
Tested with clang++ and g++.
Should this be considered a bug?

No, it's not a bug.
The C preprocessor is a different beast from the rest of the language and plays by its own rules. Changing this would break compatibility in a massive way; the preprocessor is rigorously standardized.
The usual way to work around these comma issues is a typedef:
typedef baz<20,30> baz2030_type;
FOO(baz2030_type);

The C/C++ preprocessor recognizes commas as macro argument separators unless they are nested inside parentheses. Just parentheses. Brackets, braces and template markers don't count:
The individual arguments within the list are separated by comma preprocessing tokens, but comma preprocessing tokens between matching inner parentheses do not separate arguments. (C++14 §16.3/11; C11 §6.10.3/11)
(A side effect of the above is that you can use unbalanced braces and brackets as macro arguments. That's usually not a very good idea, but you can do it if you have to.)
Problems occasionally crop up as a result; a common one is unwanted multiple arguments when the argument is supposed to be a block of code:
MY_FANCY_MACRO(1000, { int i=0, j=42; ... })
Here, the macro is called with (at least) 3 arguments, although it was probably written to accept 2.
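The rule can be observed directly with a small argument-counting macro. This is only an illustrative sketch: NARGS and NARGS_IMPL are hypothetical helper names (not from any library), and the helper only handles one to three arguments. Since the arguments are discarded, they do not need to be valid C++.

```cpp
// Hypothetical helper: NARGS(...) expands to the number of macro arguments
// it received (1 to 3 only). The arguments themselves are thrown away.
#define NARGS_IMPL(a1, a2, a3, count, ...) count
#define NARGS(...) NARGS_IMPL(__VA_ARGS__, 3, 2, 1, 0)

// Angle brackets do not protect the comma: two arguments.
constexpr int unprotected_template() { return NARGS(baz<20, 30>); }

// Parentheses do protect it: one argument.
constexpr int parenthesized_template() { return NARGS((baz<20, 30>)); }

// Braces do not protect commas: three arguments.
constexpr int code_block() { return NARGS(1000, { int i = 0, j = 42; }); }

// Square brackets do not protect commas either: two arguments.
constexpr int square_brackets() { return NARGS([1, 2]); }
```

Only the parenthesized form is seen as a single argument; every other comma splits the argument list.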
With modern C++ (and C) compilers, you have a few options. In a fairly subjective order:
1. Rewrite the macro as an inline function. If the argument is a code block, consider using a templated function which could accept a lambda or other functor. If the argument is a type, make it a template argument instead.
2. If surrounding the argument with redundant parentheses is syntactically valid, do that. But in such a case it is almost certainly the case that suggestion (1) above would have worked.
3. Define:
#define COMMA ,
and use it where necessary:
FOO(baz<20 COMMA 30>);
This doesn't require modifying the macro definition in any way, but it will fail if the macro passes the argument to another macro. (The replacement will be done before the inner macro call is parsed, so the multiple argument problem will just be deferred to the inner call.)
4. If you expect that one macro argument might contain unprotected commas, and it is the last or only argument, and you're in a position to modify the macro, and you're using C++11/C99 or better (or gcc, which has allowed this as an extension for some time), make the macro variadic:
#define FOO(...) printf("%d\n",__VA_ARGS__::val())
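A minimal sketch of three of these workarounds side by side, assuming a baz template like the one in the question. VAL_OF and VAL_OF_VARIADIC are hypothetical stand-ins for FOO that yield the value instead of printing it, so that the results can be checked.

```cpp
template <int N, int M>
struct baz {
    static int val() { return N + M; }
};

// Hypothetical single-parameter macro, shaped like FOO in the question.
#define VAL_OF(v) v::val()
// The COMMA trick: the comma only appears after argument collection.
#define COMMA ,
// Variadic variant: all arguments, commas included, end up in __VA_ARGS__.
#define VAL_OF_VARIADIC(...) __VA_ARGS__::val()

// Workaround: give the type a comma-free name first.
int via_alias() {
    using baz2030_type = baz<20, 30>;
    return VAL_OF(baz2030_type);
}

// Workaround: hide the comma behind a macro.
int via_comma_macro() { return VAL_OF(baz<20 COMMA 30>); }

// Workaround: make the macro variadic.
int via_variadic() { return VAL_OF_VARIADIC(baz<20, 30>); }
```

All three functions evaluate baz<20, 30>::val(). The alias version needs no cooperation from the macro at all, which is why it is often the least surprising choice.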

A macro's arguments are treated as plain token sequences separated by commas. Hence the comma inside the template argument list is treated as a delimiter, and the preprocessor concludes that you have passed two arguments to a single-parameter macro; hence the error.

BOOST_IDENTITY_TYPE is the solution for that: https://www.boost.org/doc/libs/1_73_0/libs/utility/identity_type/doc/html/index.html
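You can also wrap the type in decltype: decltype(std::pair<int, int>()) var; which also adds an extra pair of parentheses. The operand of decltype is unevaluated, so no constructor is actually called, but the type must be default-constructible for the expression to be valid.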
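The idea behind an identity-type macro can be sketched by hand: the parenthesized type is turned into the parameter list of a function type, and a trait extracts the parameter again. The names unwrap_arg, IDENTITY_TYPE and DECLARE_ZEROED below are hypothetical and only mimic the approach; this is not Boost's actual implementation.

```cpp
#include <utility>

// Hypothetical trait: recovers T from the function type void(T).
template <typename F>
struct unwrap_arg;

template <typename T>
struct unwrap_arg<void(T)> {
    using type = T;
};

// IDENTITY_TYPE((std::pair<int, int>)) expands to
// unwrap_arg<void (std::pair<int, int>)>::type
#define IDENTITY_TYPE(parenthesized_type) \
    unwrap_arg<void parenthesized_type>::type

// Hypothetical two-parameter macro that takes a type as its first argument.
#define DECLARE_ZEROED(type, name) type name = type()

int sum_of_pair() {
    // The commas of std::pair<int, int> are protected by the inner parentheses.
    DECLARE_ZEROED(IDENTITY_TYPE((std::pair<int, int>)), p);
    p.first = 20;
    p.second = 30;
    return p.first + p.second;
}
```

Two caveats apply to this sketch: inside a template, where the type is dependent, the expansion needs a typename prefix, and because the type passes through a function parameter, top-level const is dropped and array types decay to pointers.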

Related

How exactly is expansion of a parameter pack evaluated with std::forward?

I wanted to better understand parameter pack expansions, so I decided to research a bit and, what once seemed obvious to me, stopped being so obvious after trying to understand what exactly is going on. Let's examine a standard parameter pack expansion with std::forward:
template <typename... Ts>
void foo(Ts&& ... ts) {
std::make_tuple(std::forward<Ts>(ts)...);
}
My understanding here is that for any parameter pack Ts, std::forward<Ts>(ts)... will result in a comma-separated list of forwarded arguments with their corresponding type, e.g., for ts equal 1, 1.0, '1', the function body will be expanded to:
std::make_tuple(std::forward<int&&>(1), std::forward<double&&>(1.0), std::forward<char&&>('1'));
And that makes sense to me. Parameter pack expansion, used with a function call, results in a comma-separated list of calls to that function with appropriate arguments.
What seems to be bothering me is why then would we sometimes need to introduce the comma operator (operator,), if we want to call a bunch of functions in a similar manner? Seeing this answer, we can read this code:
template<typename T>
static void bar(T t) {}
template<typename... Args>
static void foo2(Args... args) {
(bar(args), ...); // <- notice: comma here
}
int main() {
foo2(1, 2, 3, "3");
return 0;
}
followed by information that it will result in the following expansion:
(bar(1), bar(2), bar(3), bar("3"));
Fair, makes sense, but... why? Why doing this, instead:
template<typename... Args>
static void foo2(Args... args) {
(bar(args)...); // <- notice: no comma here
}
doesn't work? According to my logic ("Parameter pack expansion, used with a function call, results in a comma-separated list of calls to that function with appropriate arguments"), it should expand into:
(bar(1), bar(2), bar(3), bar("3"));
Is it because of bar() returning void? Well, changing bar() to:
template<typename T>
static int bar(T t) { return 1; }
changes nothing. I would imagine that it would just expand to a comma-separated list of 1s (possibly doing some side effects, if bar() was designed as such). Why does this behave differently? Where is my logic flawed?
My understanding here is that for any parameter pack Ts, std::forward<Ts>(ts)... will result in a comma-separated list of forwarded arguments with their corresponding type
Well, there's your problem: that's not how it works. Or at least, not quite.
Parameter pack expansions and their nature are determined by where they are used. Before C++17, pack expansions can only be used within certain grammatical constructs, like a braced-init-list or a function call expression. Outside of such constructs (the previous list is not comprehensive), their use simply is not allowed. From C++17 onward, fold expressions allow them to be used across specific operators.
The reason for this is in part grammatical. Consider this: bar(1, (2, 3), 5). This calls the function with 3 arguments; the expression (2, 3) resolves down to a single argument. That is, there is a difference between the comma used in an expression and the comma used as a separator between values to be used in a function call. This difference is made at the grammatical level; if I want to invoke the comma operator in the middle of a sequence of function arguments, I have to put that whole thing in () so that the compiler will recognize the comma as a comma expression operator, not a comma separator.
Non-fold pack expansions effectively expand to use the separation comma, not the expression comma. As such, they can only be expanded in places where the separation kind of comma is valid.
The reason (bar(args)...) doesn't work is because a () expression cannot take the second kind of comma.
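The two kinds of comma can be contrasted in a short sketch. The helper record and both logging functions are hypothetical names chosen for illustration; the fold expression requires C++17, while the braced-init-list form works from C++11.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: appends the text form of `value` to `log`.
template <typename T>
int record(std::vector<std::string>& log, const T& value) {
    log.push_back(std::to_string(value));
    return 0;
}

// Pack expansion inside a braced-init-list: the generated commas are
// separators between initializers, which this context accepts.
template <typename... Args>
std::vector<std::string> log_with_init_list(const Args&... args) {
    std::vector<std::string> log;
    int unused[] = {0, record(log, args)...};
    (void)unused;  // the array only exists to host the expansion
    return log;
}

// C++17 fold expression: here the comma is the comma operator, so the
// whole thing is a single parenthesized expression.
template <typename... Args>
std::vector<std::string> log_with_fold(const Args&... args) {
    std::vector<std::string> log;
    (record(log, args), ...);
    return log;
}
```

Both versions call record once per argument in left-to-right order. Writing (record(log, args)...); instead would be rejected, because a parenthesized expression is not a context that accepts separator commas.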

Is it legal for a function-like macro to "steal" commas from a parenthesized template argument list?

I was just surprised that providing a type with two template arguments to a function-like macro resulted in the compiler complaining.
This (conceptually similar) example code:
template<typename T> struct foo{};
template<typename T, typename U> struct bar{};
#define p(x) printf("sizeof(" #x ") = %zu\n", sizeof(x));
int main()
{
p(foo<int>); // works, of course
p(bar<int,int>); // does not work
p((bar<int,int>)); // does not work either
return 0;
}
makes GCC (6.2.0) complain macro "p" passed 2 arguments, but takes just 1.
Well, of course, the preprocessor is a preprocessor doing text replacement, it's not a real C++ compiler which understands templates or all other rules of the language.
Maybe I'm asking too much by expecting that the preprocessor recognizes the angle brackets, granted... but at least parentheses are explicitly mentioned in the specification.
16.3 (paragraphs 10 to 12) states outermost parentheses delimiting the bounded sequence of tokens. The word "outermost" suggests that there may possibly also be further (not-outermost) parentheses which the preprocessor recognizes.
Also, it explicitly states "skipping intervening matched pairs of left and right parenthesis" as well as "comma preprocessing tokens between matching inner parentheses do not separate arguments" -- which means that if I am reading correctly, then at least the last line should in my understanding pass.
What am I understanding wrong?
p((bar<int,int>)) is a valid invocation of the p macro with a single macro argument, which is (bar<int,int>). Your understanding so far is correct.
Unfortunately, its expansion includes sizeof((bar<int,int>)), and sizeof does not accept doubly-parenthesised types.
Variadic macros (C++11) work well here as an alternative.
#define p(...) printf("sizeof(" #__VA_ARGS__ ") = %zu\n", sizeof(__VA_ARGS__));
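The two pieces of that variadic macro can be checked separately. In this sketch, SIZE_OF and TYPE_TEXT are hypothetical names; the first forwards everything to sizeof, the second stringifies the whole argument list, commas included.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// All macro arguments, commas included, are pasted into sizeof(...).
#define SIZE_OF(...) sizeof(__VA_ARGS__)

// #__VA_ARGS__ turns the complete argument list into one string literal.
#define TYPE_TEXT(...) #__VA_ARGS__

std::size_t pair_size() { return SIZE_OF(std::pair<char, char>); }

std::string pair_text() { return TYPE_TEXT(std::pair<char, char>); }
```

Because the preprocessor reassembles the arguments with their separating commas, the compiler proper sees the template-id intact.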

Comma omitted in variadic function declaration in C++

I am used to declaring variadic functions like this:
int f(int n, ...);
When reading The C++ Programming Language I found that the declarations in the book omit the comma:
int f(int n...); // the comma has been omitted
It seems like this syntax is C++ specific as I get this error when I try to compile it using a C compiler:
test.c:1:12: error: expected ‘;’, ‘,’ or ‘)’ before ‘...’ token
int f(int n...);
Is there any difference between writing int f(int n, ...) and int f(int n...)?
Why was this syntax added to C++?
According to §8.3.5/4 of the C++ standard (current draft):
Where syntactically correct and where “...” is not part of
an abstract-declarator, “, ...” is synonymous with “...”.
In short, in C++ ... (ellipsis) is a token in its own right and so can be used without the comma, but use of the comma is retained for backwards compatibility.
Currently, both of these declarations have the same meaning:
int f(int n, ...);
int f(int n ...);
This leads to an issue where the following two declarations are both legal, yet have wildly different meanings:
template <class... T> void f(T...); // function template with parameter pack
template <class T> void f(T...); // variadic function
Once C++11 introduced variadic templates, it became much more likely that the second declaration is a programmer error rather than a lazily omitted comma. As a result, there was a proposal to remove the latter from the language (P0281), but it was apparently rejected.
With int f(int n, ...); and int f(int n...);, both , ... and ... have the same meaning: the comma is optional.
However, int printz(...); is valid in C++ while int printz(, ...); is not, because the optional comma may only follow a parameter declaration. C++ therefore allows just (...), even though the arguments passed to such a function are not accessible.
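A short sketch of the declarations discussed above, assuming the caller passes the number of int arguments as the first parameter. The function names are hypothetical.

```cpp
#include <cstdarg>
#include <cstddef>

// C-style variadic function, written with the comma before the ellipsis.
int sum_with_comma(int n, ...) {
    va_list args;
    va_start(args, n);
    int total = 0;
    for (int i = 0; i < n; ++i) total += va_arg(args, int);
    va_end(args);
    return total;
}

// The same kind of function with the comma omitted (C++ only).
int sum_without_comma(int n...) {
    va_list args;
    va_start(args, n);
    int total = 0;
    for (int i = 0; i < n; ++i) total += va_arg(args, int);
    va_end(args);
    return total;
}

// Function template with a parameter pack: T... is a pack expansion here,
// not a C-style ellipsis, so the argument count is known at compile time.
template <class... T>
std::size_t count_pack(T...) {
    return sizeof...(T);
}
```

The first two functions behave identically; only the third knows the number and types of its arguments without being told.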
As I recall from the time, C++ indeed defined variadic function signatures as you note. Later, the rapidly evolving C language (on the journey from K&R to ANSI) introduced prototypes or new-style function declarations that also declared parameters inside parens after the function name. But, with two notable differences: the comma before the ellipsis, and the need for the abomination of (void) to indicate an empty parameter list (to preserve backward compatibility of the empty parens as an old style declaration).
Looking through my archives, I find The C++ Programming Language original edition "reprinted with corrections July 1987" shows:
argument-declaration-list:
    arg-declaration-list_opt ..._opt
arg-declaration-list:
    arg-declaration-list , argument-declaration
    argument-declaration
There is no form that accepts the now-optional comma. Note that arg-declaration-list is comma-separated, and it does not leave a trailing comma after the list and before the next (different) thing.
This is the most natural way to write this. If you want a comma, you need explicitly , ... in the first production, as two distinct (possibly whitespace separated) tokens.
As C's efforts toward proper standardization progressed, C++ compilers started accepting the C versions as well to allow easy use of the C standard header files.
Why did the C designers add the comma when it implies a less sensible grammatical role of the ellipsis as a fake parameter placeholder? I never found out.

Overloading function calls for compile-time constants

I'm interested to know whether one can distinguish between function calls using arguments provided by compile-time constants and those without?
For example:
int a = 2;
foo( a ); // #1: Compute at run-time
foo( 3 ); // #2: Compute at compile-time
Is there any way to provide overloads that distinguish between these two cases? Or more generally, how do I detect the use of a literal type?
I've looked into constexpr, but a function parameter cannot be constexpr. It would be neat to have the same calling syntax, but be able to generate different code based on the parameters being literal types or not.
You cannot distinguish between a compile-time literal int and a run-time variable int. If you need to do this, you can provide an overload that can only work at compile-time:
void foo(int ); // run-time
template <int I>
void foo(std::integral_constant<int, I> ); // compile-time
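A minimal sketch of this pair of overloads, with a hypothetical Path enum added purely so that the chosen overload can be observed. Note that the calling syntax differs: the compile-time version has to be passed a std::integral_constant object.

```cpp
#include <type_traits>

// Hypothetical marker used only to report which overload was selected.
enum class Path { RunTime, CompileTime };

// Run-time overload: the value is an ordinary function argument.
inline Path foo(int) { return Path::RunTime; }

// Compile-time overload: the value I is part of the argument's type.
template <int I>
constexpr Path foo(std::integral_constant<int, I>) {
    return Path::CompileTime;
}

// The template overload is usable in constant expressions.
static_assert(foo(std::integral_constant<int, 3>{}) == Path::CompileTime,
              "integral_constant selects the compile-time overload");
```

A plain foo(3) still selects the int overload, because nothing can be deduced for I from an int argument.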
I think the above answers somehow miss the point that the question was trying to make.
Is there any way to provide overloads that distinguish between these two cases? Or more generally, how do I detect the use of a literal type?
This is what an rvalue reference is for: a literal is an rvalue.
It would be neat to have the same calling syntax, but be able to generate different code based on the parameters being literal types or not.
You can simply overload your foo() function as:
void foo(int&& a);
So when you call the function with a literal, e.g. foo(3), the compiler selects the above overload, as 3 is an rvalue. If you call the function as foo(a), the compiler picks your original version foo(const int& a);, as a, declared by int a=2;, is an lvalue. Note that this distinguishes value categories rather than compile-time constants: any other rvalue, such as a + 1, also selects the int&& overload.
And this gives you the same calling syntax.
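What this pair of overloads actually distinguishes can be seen in a small sketch. The overloads return a label purely for illustration, and the three caller functions are hypothetical.

```cpp
#include <string>

// Selected for lvalues such as named variables.
std::string foo(const int&) { return "lvalue"; }

// Selected for rvalues: literals, but also any temporary result.
std::string foo(int&&) { return "rvalue"; }

std::string call_with_variable() {
    int a = 2;
    return foo(a);
}

std::string call_with_literal() { return foo(3); }

std::string call_with_runtime_temporary() {
    int a = 2;
    return foo(a + 1);  // an rvalue whose value is not a literal
}
```

The last call shows that the int&& overload is chosen by value category alone, so it cannot be used to tell compile-time constants apart from values computed at run time.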
In the general case you cannot get foo(3) evaluated at compile time. What if foo(x) were defined as "add x days to the current date" and you first ran the program next Tuesday? If the value really is a constant, use a symbolic constant. If it is a simple function, you could try a #define, which is replaced with its implementation at compile time but is still evaluated at run time.
e.g.
#define MIN(x,y) ((x)<(y)?(x):(y))

why would I call a function with the function name wrapped in parens?

I recently came across code that looked like this:
void function(int a, int b, int c){
//...
}
int main(){
//...
(function)(1,2,3);
//...
}
What is the point of wrapping the function name separately in parens?
Does it have any effect that would be different than function(1,2,3);?
Why does the language allow such syntax?
The only case I can think of where it would matter is when function is defined as a macro.
In C, standard library functions may also be implemented as function-like macros (for efficiency). Enclosing the function name in parentheses calls the actual function, since the name is then not immediately followed by a (, so no macro expansion takes place.
As for why the language allows the syntax, a function call consists of an expression of pointer-to-function type followed by the arguments in parentheses. In most cases, the prefix is a function name (which is implicitly converted to a pointer to the function), but it can be an arbitrary expression. Any expression may be enclosed in parentheses, usually without changing its meaning (other than affecting precedence). (But see Jonathan Leffler's comments for some counterexamples.)
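The macro-suppression effect can be demonstrated with a function and a function-like macro of the same name. Both are hypothetical; the macro deliberately adds 1000 so that the two are easy to tell apart.

```cpp
// An ordinary function.
int twice(int x) { return 2 * x; }

// A function-like macro with the same name, defined after the function.
// It deliberately computes something different.
#define twice(x) (2 * (x) + 1000)

// `twice` followed directly by `(` is a macro invocation.
int through_macro() { return twice(2); }

// In `(twice)`, the name is followed by `)`, so the macro is not expanded
// and the real function is called.
int through_function() { return (twice)(2); }
```

The same technique works for standard library functions that an implementation also provides as macros.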
In addition to suppressing function-like macro expansions, wrapping an unqualified function name in parentheses suppresses argument-dependent lookup. For example:
namespace meow {
struct kitty {};
void purr(kitty) {}
}
int main() {
meow::kitty stl;
purr(stl); // OK, ADL finds meow::purr
(purr)(stl); // error; no ADL is performed
}