Why is clang parsing this as a user-defined literal? - c++

I have some code I am maintaining that I've started compiling under clang 3.3.
When compiling with "-std=c++11", clang generates an error (given below). I've distilled the offending code to the following:
#include <stdio.h>
#define DBG_PRT(__format, ...) \
printf("%s:%d:%s: "__format, __FILE__, \
__LINE__, __FUNCTION__, ## __VA_ARGS__)
int main()
{
DBG_PRT("%s\n", "Hi");
}
This is clang's output:
test.cpp:10:5: error: no matching literal operator for call to
'operator "" __format' with arguments of types 'const char *' and
'unsigned int'
DBG_PRT("%s\n", "Hi");
^ test.cpp:4:29: note: expanded from macro 'DBG_PRT'
printf("%s:%d:%s: "__format, __FILE__, \
^ 1 error generated.
Without spaces between the string literal and "__format", it doesn't seem like the preprocessor should be able to expand __format. It clearly is, though, when not specifying -std=c++11. G++ 4.4.7 (with and without -std=c++0x) compiles just fine.
Is there an error with the compiler?

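This is because a string literal immediately followed by an identifier (here "%s:%d:%s: "__format) is the C++11 syntax for a user-defined string literal. Put a space between the literal and __format to get the old behavior, where the macro parameter is expanded and the adjacent literals are concatenated. G++ 4.4.7 accepts the code because it does not implement user-defined literals; support appeared in GCC 4.7.
Also, as @Fred has pointed out, avoid reserved identifiers (names containing a double underscore).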
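A minimal sketch of the fix described above, assuming a hypothetical DBG_FMT macro and format_debug helper (neither appears in the question). It keeps a space between the literal and the macro parameter and uses a non-reserved parameter name:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical replacement for the format part of DBG_PRT: the space after the
// literal keeps it a plain string literal in C++11, and "fmt" avoids the
// reserved double-underscore name.
#define DBG_FMT(fmt) "%s:%d: " fmt

// Illustrative helper (an assumption, not from the question): formats into a
// string instead of printing so the result can be inspected.
std::string format_debug(const char* file, int line, const char* msg) {
    char buf[256];
    int n = std::snprintf(buf, sizeof buf, DBG_FMT("%s\n"), file, line, msg);
    assert(n >= 0);  // snprintf signals an encoding error with a negative value
    return std::string(buf);
}
```

Calling format_debug("test.cpp", 11, "Hi") should yield "test.cpp:11: Hi\n" with or without -std=c++11, since the macro body no longer contains a literal directly followed by an identifier.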

Related

c++ complains about __VA_ARGS__

The following code has been compiled with gcc-5.4.0 with no issues:
% gcc -W -Wall a.c
...
#include <stdio.h>
#include <stdarg.h>
static int debug_flag;
static void debug(const char *fmt, ...)
{
va_list ap;
va_start(ap, fmt);
vfprintf(stderr, fmt, ap);
va_end(ap);
}
#define DEBUG(...) \
do { \
if (debug_flag) { \
debug("DEBUG:"__VA_ARGS__); \
} \
} while(0)
int main(void)
{
int dummy = 10;
debug_flag = 1;
DEBUG("debug msg dummy=%d\n", dummy);
return 0;
}
However compiling this with g++ has interesting effects:
% g++ -W -Wall -std=c++11 a.c
a.c: In function ‘int main()’:
a.c:18:10: error: unable to find string literal operator ‘operator""__VA_ARGS__’ with ‘const char [8]’, ‘long unsigned int’ arguments
debug("DEBUG: "__VA_ARGS__); \
% g++ -W -Wall -std=c++0x
<same error>
% g++ -W -Wall -std=c++03
<no errors>
Changing debug("DEBUG:"__VA_ARGS__); to debug("DEBUG:" __VA_ARGS__); i.e. adding a space before __VA_ARGS__, makes the code compile with all three -std= options.
What is the reason for this behaviour? Thanks.
Since C++11 there is support for user-defined literals, which are literals, including string literals, immediately (without whitespace) followed by an identifier. A user-defined literal is considered a single preprocessor token. See https://en.cppreference.com/w/cpp/language/user_literal for details on their purpose.
Therefore "DEBUG:"__VA_ARGS__ is a single preprocessor token and it has no special meaning in a macro definition. The correct behavior is to simply place it unchanged into the macro expansion, where it then fails to compile because no user-defined literal operator for a __VA_ARGS__ suffix was declared.
So GCC is correct to reject it as C++11 code.
This is one of the backwards-incompatible changes between C++03 and C++11 listed in the appendix of the C++11 standard draft N3337: https://timsong-cpp.github.io/cppwp/n3337/diff.cpp03.lex
Before C++11 the string literal (up to the closing ") would be its own preprocessor token and the following identifier a second preprocessor token, even without whitespace between them.
So GCC is also correct to accept it in C++03 mode. (-std=c++0x is the same as -std=c++11, C++0x was the placeholder name for C++11 when it was still in drafting)
It is also an incompatibility with C (in all revisions up to now) since C doesn't support user-defined literals either and considers the two parts of "DEBUG:"__VA_ARGS__ as two preprocessor tokens as well.
Therefore it is correct for GCC to accept it as C code as well (which is how the gcc command interprets .c files in contrast to g++ which treats them as C++).
To fix this, add whitespace between "DEBUG:" and __VA_ARGS__ as you suggested. That should make it compatible with all C and C++ revisions.
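To make the tokenization difference concrete, here is a small sketch. The _tag suffix, its operator, and the PREFIXED macro are hypothetical names chosen for illustration; they are not part of the question's code:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical literal operator for the suffix _tag.
std::string operator""_tag(const char* s, std::size_t n) {
    return "[" + std::string(s, n) + "]";
}

// One preprocessing token: the literal plus suffix calls operator""_tag.
std::string with_suffix() { return "DEBUG:"_tag; }

// With whitespace, __VA_ARGS__ is its own token and is replaced as usual;
// the two adjacent string literals are then concatenated.
#define PREFIXED(...) "DEBUG:" __VA_ARGS__
std::string with_space() { return PREFIXED("msg"); }

// Small usage check of both forms.
void demo() {
    assert(with_suffix() == "[DEBUG:]");
    assert(with_space() == "DEBUG:msg");
}
```

In the failing macro, "DEBUG:"__VA_ARGS__ takes the first route: the compiler looks for a literal operator with the suffix __VA_ARGS__, finds none, and reports the error quoted in the question.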

[[maybe_unused]] and Constructors

Trying to compile the sqlpp17 codebase with gcc 8.2.1 and clang 6.0.1 has been a really strange experience. The code pushes the compilers to their limits, and I have probably hit a few compiler bugs along the way.
From the GCC Docs, [[maybe_unused]] is implemented since version 7, but if used this way:
struct foo {
foo([[maybe_unused]] bool thing1)
{
}
};
I hit this specific error:
<source>:2:9: error: expected unqualified-id before '[' token
foo([[maybe_unused]] bool thing1)
^
<source>:2:9: error: expected ')' before '[' token
foo([[maybe_unused]] bool thing1)
~^
)
Compiler returned: 1
Now, I know too little about C++17 to tell whether this error is correct. I do know that clang 6 compiles that part fine (and fails somewhere else).
So, who's right, clang or gcc? (flags are -std=gnu++17 for both clang and gcc, generated by CMake)
This is a known bug in g++: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81429
G++ doesn't correctly parse the [[maybe_unused]] attribute on the first argument of a constructor.
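Until a fixed compiler is available, one possible workaround (a sketch, not something taken from the bug report) is to fall back on the older idioms for unused parameters, which involve no attribute on the constructor parameter at all. The struct bar and the constructed member are hypothetical additions for illustration:

```cpp
#include <cassert>

struct foo {
    bool constructed = false;
    // Older idiom: cast the parameter to void to mark it as intentionally unused.
    explicit foo(bool thing1) : constructed(true) { static_cast<void>(thing1); }
};

struct bar {
    bool constructed = false;
    // Alternative: leave the parameter unnamed, keeping the name as a comment.
    explicit bar(bool /*thing1*/) : constructed(true) {}
};

// Small usage check: both constructors compile and run.
void demo() {
    foo f(true);
    bar b(false);
    assert(f.constructed && b.constructed);
}
```

Both forms silence unused-parameter warnings without relying on attribute parsing, at the cost of being less self-documenting than [[maybe_unused]].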

Missing an error when incorrectly passing string to printf-style log function

It is very common for any medium-to-large project to replace printf with a custom log function. Here is a minimal C++ example and its usage:
#include <stdio.h>
#include <stdarg.h>
#include <string>
void log_printf(const char* fmt, ...) {
va_list ap;
va_start(ap, fmt);
vprintf(fmt, ap); // real code obviously does something fancier
va_end(ap);
}
int main() {
std::string x = "Hello";
// correct code
printf("String is %s\n", x.c_str());
log_printf("String is %s\n", x.c_str());
// incorrect code
printf("String is %s\n", x); // bad line 1
log_printf("String is %s\n", x); // bad line 2
}
The simple logger receives a variable amount of arguments and calls vprintf to output them to standard output. The lines under 'correct code' demonstrate correct usage of this logger. My question involves the 'bad' lines, where a string object is incorrectly passed instead of a pointer to the character buffer.
Under GCC 4.6 (tested under Linux) neither of the bad lines can compile, which is a good thing because I want to catch such incorrect usage. The error is:
error: cannot pass objects of non-trivially-copyable type ‘std::string {aka struct std::basic_string<char>}’ through ‘...’
However in GCC 5.1 it has apparently become possible to pass non-trivially-copyable objects, and the compilation succeeds. If I use -Wall then only 'bad line 1' raises a warning about an unexpected argument type, but 'bad line 2' with the log_printf compiles without issue in any case. Needless to say both lines produce garbage output.
I can catch 'bad line 1' with -Wall -Werror, but what about 'bad line 2'? How can I cause it to also generate a compilation error?
For your own functions you need to use a common function attribute called format:
void log_printf(const char* fmt, ...) __attribute__((format (printf, 1, 2)));
void log_printf(const char* fmt, ...) {
...
}
Note that with this trailing syntax the attribute must be placed on a function declaration, not on the definition.
The first argument to the format attribute is the style, in this case printf (scanf is also possible, for functions that work like scanf). The second argument is the 1-based position of the format string parameter, and the third is the 1-based position of the first argument to check against the format, which here is where the ellipsis ... begins. This lets GCC check the format strings as it does for the standard printf function.
This is a GCC extension, though some other compilers have adopted it to become GCC compatible, most notably the Intel C compiler ICC and Clang (the standard compiler used on OSX and some BSD variants). The Visual Studio compiler does not have this extension, and I know of no similar thing for Visual C++.
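Because the attribute is an extension, it is common to hide it behind a macro that expands to nothing on other compilers. The following is a sketch under that assumption; LOG_PRINTF_LIKE and log_format are hypothetical names, and log_format returns a string rather than printing so that its result can be checked:

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical wrapper macro: enables format checking where the GCC-style
// attribute is understood and expands to nothing elsewhere.
#if defined(__GNUC__) || defined(__clang__)
#  define LOG_PRINTF_LIKE(fmt_idx, args_idx) \
       __attribute__((format(printf, fmt_idx, args_idx)))
#else
#  define LOG_PRINTF_LIKE(fmt_idx, args_idx)
#endif

// Declaration carries the attribute: format string is parameter 1,
// the checked arguments start at parameter 2.
std::string log_format(const char* fmt, ...) LOG_PRINTF_LIKE(1, 2);

// Definition without the attribute.
std::string log_format(const char* fmt, ...) {
    char buf[256];
    va_list ap;
    va_start(ap, fmt);
    std::vsnprintf(buf, sizeof buf, fmt, ap);  // truncates if the message is too long
    va_end(ap);
    return std::string(buf);
}

// Small usage check with a correctly typed argument.
void demo() {
    std::string x = "Hello";
    assert(log_format("String is %s\n", x.c_str()) == "String is Hello\n");
}
```

With the attribute in place, passing x instead of x.c_str() should trigger the same -Wformat diagnostic that printf gets, which -Werror can then turn into a compilation error.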

How does GCC process quotes in macro?

I'm trying to test how the gcc preprocessor handles macro expansion.
I wrote the following code (just for testing):
#include <stdio.h>
#define QUOTE "
#define TMPL hello
int main(){
printf(QUOTE TMPL QUOTE);
return 0;
}
The compilation result is:
$ gcc main.c -o main
main.c:3:15: warning: missing terminating " character
main.c: In function ‘main’:
main.c:7: error: missing terminating " character
main.c:7: error: missing terminating " character
main.c:7: error: ‘hello’ undeclared (first use in this function)
main.c:7: error: (Each undeclared identifier is reported only once
main.c:7: error: for each function it appears in.)
main.c:7: warning: format not a string literal and no format arguments
main.c:7: warning: format not a string literal and no format arguments
$
Then I try to have a look at the preprocessed result
$ gcc -E main.c -o tmp.c
main.c:3:15: warning: missing terminating " character
$
Though giving a warning, it somehow produces correct preprocessed code in tmp.c
int main(){
printf(" hello ");
return 0;
}
When I compile tmp.c, hello is printed correctly.
I'm wondering why gcc -E can produce correct code while compiling directly with gcc fails. Is there a difference between the two ways of running the gcc preprocessor?
$ gcc --version
i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)
As I commented, a preprocessor macro should expand into a sequence of lexer tokens. Within the GCC source code, libcpp (which is in charge of preprocessing and tokenizing) produces a stream of tokens, not plain characters. A recent GCC 4.8, when run as gcc -Wall endyul.c -o endyul on your example, gives quite helpful diagnostics:
endyul.c:3:15: warning: missing terminating " character [enabled by default]
#define QUOTE "
^
endyul.c: In function 'main':
endyul.c:7:5: error: missing terminating " character
printf(QUOTE TMPL QUOTE);
^
endyul.c:7:5: error: missing terminating " character
endyul.c:4:14: error: 'hello' undeclared (first use in this function)
#define TMPL hello
^
endyul.c:7:18: note: in expansion of macro 'TMPL'
printf(QUOTE TMPL QUOTE);
^
endyul.c:4:14: note: each undeclared identifier is reported only once for each function it appears in
#define TMPL hello
^
endyul.c:7:18: note: in expansion of macro 'TMPL'
printf(QUOTE TMPL QUOTE);
^
Your GCC 4.2 is very old. You should consider upgrading it.
Clang (3.3) also gives a good diagnostic:
clang -Wall endyul.c -o endyul
endyul.c:3:15: warning: missing terminating '"' character [-Winvalid-pp-token]
#define QUOTE "
^
endyul.c:7:12: error: expected expression
printf(QUOTE TMPL QUOTE);
^
endyul.c:3:15: note: expanded from macro 'QUOTE'
1 warning and 1 error generated.
Read the CPP manual of GCC again, notably the chapters on Stringification and Concatenation.
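For reference, the token-based way to get a quoted string out of a macro is the # (stringification) operator rather than a macro holding a lone quote character. A minimal sketch, where STR and STR_IMPL are hypothetical helper names:

```cpp
#include <cassert>
#include <string>

#define STR_IMPL(x) #x        // turns its argument's spelling into a string literal
#define STR(x) STR_IMPL(x)    // extra level so x is macro-expanded first
#define TMPL hello

std::string expanded()   { return STR(TMPL); }       // TMPL is expanded, then quoted
std::string unexpanded() { return STR_IMPL(TMPL); }  // TMPL is quoted as written

// Small usage check of both forms.
void demo() {
    assert(expanded() == "hello");
    assert(unexpanded() == "TMPL");
}
```

The two-level macro matters because an argument used with # is not macro-expanded; going through STR first lets TMPL become hello before it is stringified.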

%llx format specifier: invalid warning?

Edited to remove the first warning
The following code works as expected in g++ 4.4.0 under mingw32:
#include <cstdio>
int main()
{
long long x = 0xdeadbeefc0defaceLL ;
printf ("%llx\n", x) ;
}
But if I enable all warnings with -Wall, it says:
f.cpp: In function 'int main()':
f.cpp:5: warning: unknown conversion type character 'l' in format
f.cpp:5: warning: too many arguments for format
It's the same with %lld. Is this fixed in newer versions?
Edited again to add:
The warning doesn't go away if I specify -std=c++0x, even though (i) long long is a standard type, and (ii) %lld and %llx seem to be officially supported. For instance, from 21.5 Numeric conversions para 7:
Each function returns a string object holding the character representation of the value of
its argument that would be generated by calling sprintf(buf, fmt, val) with a format specifier of
"%d", "%u", "%ld", "%lu", "%lld", "%llu", "%f", "%f", or "%Lf", respectively, where buf designates
an internal character buffer of sufficient size.
So this is a bug, surely?
long long x = 0xdeadbeefc0defaceLL; // note LL in the end
And there is no ll length specifier for printf. The best you can get is:
printf ("%lx\n", x); // l is for long int
I've tested your sample on my g++, it compiles without errors even without -std=c++0x flag:
~$ g++ -Wall test.cpp
~$ g++ --version
g++ (Ubuntu 4.4.3-4ubuntu5) 4.4.3
So, yes, this is fixed in newer versions.
As for the first warning, you must use 0xdeadbeefc0defaceLL instead of 0xdeadbeefc0deface. After that, the other warnings may go away as well.
I get the same warning compiling C using windows/mingw32.
warning: unknown conversion type character 'l' in format
So yes, probably a compiler/platform specific bug.
It's a Mingw-specific issue, because it calls the native Windows runtime for certain things, including this. See this answer.
%I64d works for me. In the answer linked above there is a more portable albeit less readable solution as well.
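One portable option (which may or may not be the one the linked answer has in mind) is the format macros from <cinttypes>, which expand to whatever length modifier the platform's runtime expects. A sketch, with to_hex as a hypothetical helper name:

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdio>
#include <string>

// Formats a 64-bit value as lowercase hex using the platform's own specifier.
// Note the space in "%" PRIx64: without it, C++11 would read the literal and
// the macro name as a single user-defined-literal token.
std::string to_hex(std::uint64_t v) {
    char buf[32];
    std::snprintf(buf, sizeof buf, "%" PRIx64, v);
    return std::string(buf);
}

// Small usage check with the value from the question.
void demo() {
    assert(to_hex(0xdeadbeefc0defaceULL) == "deadbeefc0deface");
}
```

The value is taken as std::uint64_t here so that the argument type matches the PRIx64 specifier exactly; for signed output, PRId64 with std::int64_t plays the same role.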