Macros and multi-line comments - c++

I need to have multi-line comments within a group of macros so that one of the macros initiates a comment block and another concludes it, like this:
#define C_BEGIN /*
#define C_END */
... other macros
But sure enough, this approach doesn't work.

You can't do it for the following reasoning. Let's assume it is possible.
So you created a macro that replaces itself with /*, and another for */. What happens then? First, the comments are removed from the code. After that, the preprocessor replaces your macros with the comments. After that, the compiler will choke: it doesn't know what to do with /* and */ because it simply never faces such things: the comments are always delete before the compilation, so it doesn't even know what a "comment" is. It will probably think it's a division followed by multiplication.
So our assumption is wrong and you can't do it.

Comment processing happens before macro expansion:
c++11
2.2 Phases of translation [lex.phases]
1 - The precedence among the syntax rules of translation is specified by the following phases. [...]
3. [...] Each comment is replaced by one space character. [...]
4. Preprocessing directives are executed, macro invocations are expanded [...]
Perhaps you could try preprocessing your source file twice? (Note: don't do this.)

But sure enough, this approach doesn't work.
It can't work. The comment in your #define C_BEGIN is not a part of and cannot be a part of your macro definition. As far as the language is concerned, your #define C_END is not a macro definition. It just a part of that multiline comment. In other words, it is whitespace. Comments are processed (turned into whitespace) before the preprocessor / compiler gets to the stage of interpreting your macro definitions.

If you are using an IDE, you can simply press ctrl/ for Windows or command/ on Mac. You should select the lines you want to comment first.

Related

Flex C++ - #ifdef inside flex block

I want to define constant in preprocessor which launches matching some patterns only when it's defined. Is it possible to do this, or there is the other way how to deal with this problem?
i.e. simplified version of removing one-line comments in C:
%{
#define COMMENT
%}
%%
#ifdef COMMENT
[\/][\/].*$ ;
#endif
[1-9][0-9]* printf("It's a number, and it works with and without defining COMMENT");
%%
There is no great solution to this (very reasonable) request, but there are some possibilities.
(F)lex start conditions
Flex start conditions make it reasonably simple to define a few optional patterns, but they don't compose well. This solution will work best if you have only a single controlling variable, since you will have ti define a separate start condition for every possible combination of controlling variables.
For example:
%s NO_COMMENTS
%%
<NO_COMMENTS>"//".* ; /* Ignore comments in `NO_COMMENTS mode. */
The %s declaration means that all unmarked rules also apply to the N_COMMENTS state; you will commonly see %x ("exclusive") in examples, but that would force you to explicitly mark almost every rule.
Once you have modified you grammar in this way, you can select the appropriate set of rules at run-time by setting the lexer's state with BEGIN(INITIAL) or BEGIN(NO_COMMENTS). (The BEGIN macro is only defined in the flex generated file, so you will want to export a function which performs one of these two actions.)
Using cpp as a utility.
There is no preprocessor feature in flex. It's possible that you could use a C preprocessor to preprocess your flex file before passing it to flex, but you will have to be very careful with your input file:
The C preprocessor expects its input to be a sequence of valid C preprocessor tokens. Many common flex patterns will not match this assumption, because of the very different quoting rules. (For a simple example, a common pattern to recognise C comments includes the character class [^/*] which will be interpreted by the C preprocessor as containing the start of a C comment.)
The flex input file is likely to have a number of lines which are valid #include directives. There is no way to avoid these directives from being expanded (other than removing them from the file). Once expanded and incorporated into the source, the header files no longer have include guards, so you will have to tell flex not to insert any #include files from its own templates. I believe that is possible, but it will be a bit fragile.
The C preprocessor may expand what looks to it like a macro invocation.
The C preprocessor might not preserve linear whitespace, altering the meaning of the flex scanner definition.
m4 and other preprocessors
It would be safer to use m4 as a preprocessor, but of course that means learning m4. ( You shouldn't need to install it because flex already depends on it. So if you have flex you also have m4.) And you will still need to be very careful with quoting sequences. M4 lets you customize these sequences, so it is more manageable than cpp. But don't copy the common idiom of defining [[ as a quote delimiter; it is very common inside regular expressions.
Also, m4 does not insert #line directives and any non-trivial use will change the number of input lines, making error messages harder to interpret. (To say nothing of the challenge of debugging.) You can probably avoid this issue in this very simple case but the issue will reappear.
You could also write your own simple preprocessor, but you will still need to address the above issues.

How do I use a preprocessor macro inside an include?

I am trying to build freetype2 using my own build system (I do not want to use Jam, and I am prepared to put the time into figuring it out). I found something odd in the headers. Freetype defines macros like this:
#define FT_CID_H <freetype/ftcid.h>
and then uses them later like this:
#include FT_CID_H
I didn't think that this was possible, and indeed Clang 3.9.1 complains:
error: expected "FILENAME" or <FILENAME>
#include FT_CID_H
What is the rationale behind these macros?
Is this valid C/C++?
How can I convince Clang to parse these headers?
This is related to How to use a macro in an #include directive? but different because the question here is about compiling freetype, not writing new code.
I will address your three questions out of order.
Question 2
Is this valid C/C++?
Yes, this is indeed valid. Macro expansion can be used to produce the final version of a #include directive. Quoting C++14 (N4140) [cpp.include] 16.2/4:
A preprocessing directive of the form
# include pp-tokens new-line
(that does not match one of the two previous forms) is permitted. The preprocessing tokens after include
in the directive are processed just as in normal text (i.e., each identifier currently defined as a macro name is
replaced by its replacement list of preprocessing tokens). If the directive resulting after all replacements does
not match one of the two previous forms, the behavior is undefined.
The "previous forms" mentioned are #include "..." and #include <...>. So yes, it is legal to use a macro which expands to the header/file to include.
Question 1
What is the rationale behind these macros?
I have no idea, as I've never used the freetype2 library. That would be a question best answered by its support channels or community.
Question 3
How can I convince Clang to parse these headers?
Since this is legal C++, you shouldn't have to do anything. Indeed, user #Fanael has demonstrated that Clang is capable of parsing such code. There must be some problem other problem in your setup or something else you haven't shown.
Is this valid C/C++?
The usage is valid C, provided that the macro definition is in scope at the point where the #include directive appears. Specifically, paragraph 6.10.2/4 of C11 says
A preprocessing directive of the form
# include pp-tokens new-line
(that does not match one of the two previous forms) is permitted. The
preprocessing tokens after include in the directive are processed just
as in normal text. (Each identifier currently defined as a macro name
is replaced by its replacement list of preprocessing tokens.) The
directive resulting after all replacements shall match one of the two
previous forms.
(Emphasis added.) Inasmuch as the preprocessor has the same semantics in C++ as in C, to the best of my knowledge, the usage is also valid in C++.
What is the rationale behind these macros?
I presume it is intended to provide for indirection of the header name or location (by providing alternative definitions of the macro).
How can I convince Clang to parse these headers?
Provided, again, that the macro definition is in scope at the point where the #include directive appears, you shouldn't have to do anything. If indeed it is, then Clang is buggy in this regard. In that case, after filing a bug report (if this issue is not already known), you probably need to expand the troublesome macro references manually.
But before you do that, be sure that the macro definitions really are in scope. In particular, they may be guarded by conditional compilation directives -- in that case, the best course of action would probably be to provide whatever macro definition is needed (via the compiler command line) to satisfy the condition. If you are expected to do this manually, then surely the build documentation discusses it. Read the build instructions.

Why do parens prevent macro substitution?

While researching solutions to the windows min/max macro problem, I found an answer that I really like but I do not understand why it works. Is there something within the C++ specification that says that macro substitution doesn't occur within parens? If so where is that? Is this just a side effect of something else or is the language designed to work that way? If I use extra parens the max macro doesn't cause a problem:
(std::numeric_limits<int>::max)()
I'm working in a large scale MFC project, and there are some windows libraries that use those macros so I'd prefer not to use the #undef trick.
My other question is this. Does #undef max within a .cpp file only affect the file that it is used within, or would it undefine max for other compilation units?
Function-like macros only expand when the next thing after is an opening parenthesis. When surrounding the name with parentheses, the next thing after the name is a closing parenthesis, so no expansion occurs.
From C++11 § 16.3 [cpp.replace]/10:
Each subsequent instance of the function-like macro name followed by a ( as the next preprocessing token introduces the sequence of preprocessing tokens that is replaced by the replacement list in the definition (an invocation of the macro).
To answer the other question, preprocessing happens before normal compilation and linking, so doing an #undef in an implementation file will only affect that file. In a header, it affects every file that includes that header.

At which point during preprocessing stage is __LINE__ expanded?

I have a function in a header file:
template <int line>
inline void log() {}
And then I try this trick to make using it easier:
#define LOG_LINE log<__LINE__>()
And then in a .cpp file I do:
void main()
{
LOG_LINE;
}
And it seems that it works the way I'd like it to. I get the line from .cpp file, not the line at which LOG_LINE is declared in .h file. But I don't understand how it works. Does C++ perfrom double-pass preprocessing, leaving special macros like __LINE__ for second pass? Is this portable (standard) behavior? Should I expect this to work with all major C++ compilers? So far I've only tried MSVC.
One should distinguish between the number of passes through the entire input, which is what terms like single-pass normally refer to, and handling of nested expansions. The preprocessor normally expands all macros in a single pass through the file, but it correctly expands the expanded forms, until there is nothing left to expand.
That is, LOG_LINE gets expanded to log<__LINE__>(), where __LINE__ again gets expanded to 3, producing the final expansion log<3>() — all in a single pass through the compilation unit.
Yes, this is portable, standard behaviour. Yes you can depend on it.
The reason is that the #define command simply stores the tokens that constitute the expansion text without interpreting it or expanding it at all. Because they are tokens, white space and comments are not stored.
Then when the macro name is used in the program text, it is replaced by the expansion text (and any arguments as needed). Then any tokens in the substituted text are scanned and replaced, and so on (except there is no recursive replacement).
In your case it takes two expansions to get to the underlying line number.
See N3797 16.3. It's relatively readable, for a standards document. There are examples quite close to what you're asking.
This does not need double pass preprocessing. This is about the order of expansion of nested macros (well, in a sense, an expression is expanded in as many passes as needed to expand all the nested macros). A quote from an answer to this question:
All arguments which don't appear after the # operator or either side of a ## are fully macro expanded when they are replaced, not before the function-like macro is expanded.

Multiple preprocessor directives on one line in C++

A hypothetical question: Is it possible to have a C++ program, which includes preprocessor directives, entirely on one line?
Such a line would look like this:
#define foo #ifdef foo #define bar #endif
What are the semantics of such a line?
Further, are there any combinations of directives which are impossible to construct on one line?
If this is compiler-specific then both VC++ and GCC answers are welcome.
A preprocessing directive must be terminated by a newline, so this is actually a single preprocessing directive that defines an object-like macro, named foo, that expands to the following token sequence:
# ifdef foo # define bar # endif
Any later use of the name foo in the source (until it is #undefed) will expand to this, but after the macro is expanded, the resulting tokens are not evaluated as a preprocessing directive.
This is not compiler-specific; this behavior is defined by the C and C++ standards.
Preprocessor directives are somewhat different than language statements, which are terminated by ; and use whitespace to delimit tokens. In the case of the preprocessor, the directive is terminated by a newline so it's impossible to do what you're attempting using the C++ language itself.
One way you could kind of simulate this is to put your desired lines into a separate header file and then #include it where you want. The separate header still has to have each directive on one line, but the point where you include it is just a single line, effectively doing what you asked.
Another way to accomplish something like that is to have a pre-C++ file that you use an external process to process into a C++ source file prior to compiling with your C++ compiler. This is probably rather more trouble than it's worth.