What does the following line do?
#line 25 "CSSGrammar.y"
And what's with the extension?
According to the Standard:
§16.4.3:
A preprocessing directive of the form
# line digit-sequence new-line
causes the implementation to behave as if the following sequence of source lines begins with a source line
that has a line number as specified by the digit sequence (interpreted as a decimal integer). If the digit
sequence specifies zero or a number greater than 2147483647, the behavior is undefined.
§16.4.4:
A preprocessing directive of the form
# line digit-sequence "s-char-sequence(opt)" new-line
sets the presumed line number similarly and changes the presumed name of the source file to be the contents
of the character string literal.
§16.4.5:
A preprocessing directive of the form
# line pp-tokens new-line
(that does not match one of the two previous forms) is permitted. The preprocessing tokens after line
on the directive are processed just as in normal text (each identifier currently defined as a macro name is
replaced by its replacement list of preprocessing tokens). If the directive resulting after all replacements
does not match one of the two previous forms, the behavior is undefined; otherwise, the result is processed
as appropriate.
The .y extension is the conventional one for YACC grammar files; the author presumably used it to make that apparent (the word "grammar" in the name points the same way, though that is just a guess).
It simply states that the current line of code is sourced from line 25 of CSSGrammar.y, the YACC-style grammar file from which this code was generated.
This can be used by debuggers to step into the grammar itself as opposed to the generated code.
The #line directive changes the position the compiler reports, and is used by code-generating software to help the programmer locate a problem in the original source. Anyone can use it to make error reporting more informative.
For instance, say your parser generator produces a CSSGrammar.cpp file that contains C++ snippets and is compiled by the C++ compiler. A #line 25 "CSSGrammar.y" directive tells the C++ compiler to treat the line that follows as if it were line 25 of CSSGrammar.y.
The compiler continues to parse subsequent lines and reports errors relative to the position set by that directive.
So if an error occurs 3 lines later, it is reported as an error on line 28 of CSSGrammar.y.
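A minimal sketch of that arithmetic, using the standard __FILE__ and __LINE__ macros to observe the position the compiler believes it is at. Only the #line 25 "CSSGrammar.y" directive comes from the question; the Position struct, the function names and the file name handwritten.cpp are made up for illustration.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper type: the position the compiler presumes at some point.
struct Position {
    const char* file;
    int line;
};

#line 25 "CSSGrammar.y"
inline Position at_directive() { return {__FILE__, __LINE__}; }  // presumed line 25
// presumed line 26
// presumed line 27
inline Position three_lines_later() { return {__FILE__, __LINE__}; }  // presumed line 28

// A later directive resets the position again, as generated files typically do.
#line 1 "handwritten.cpp"
inline Position after_reset() { return {__FILE__, __LINE__}; }  // presumed line 1

// Self-check of the values claimed in the comments above.
inline void check_positions() {
    assert(at_directive().line == 25);
    assert(std::strcmp(at_directive().file, "CSSGrammar.y") == 0);
    assert(three_lines_later().line == 28);
    assert(after_reset().line == 1);
    assert(std::strcmp(after_reset().file, "handwritten.cpp") == 0);
}
```

Calling check_positions() from main should pass silently. Diagnostics and debug information follow the same presumed file and line.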
Note that a single generated file can contain code that originates from several places, and that this directive is an effective way to attribute errors to the right one.
Typically you'll see multiple #line directives along the way; each one accounts for another injected piece of code (it resets the reporting caret, if you will).
Note that the #line directive can be used by ANY generator, including your own, and is not in any way limited to parser generators.
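As a rough illustration of what a home-grown generator might do, the sketch below prefixes a copied snippet with a #line directive. The function name with_line_directive and its parameters are hypothetical; real generators such as bison have their own, more elaborate machinery.

```cpp
#include <cassert>
#include <string>

// Hypothetical generator helper: wraps a user-written snippet so that compiler
// diagnostics point at (source_file, source_line) instead of the generated file.
inline std::string with_line_directive(const std::string& source_file,
                                       int source_line,
                                       const std::string& snippet) {
    // Per the standard text quoted above, a line number of zero is undefined.
    assert(source_line > 0);

    // The file name is written as a string literal, so escape \ and ".
    std::string escaped;
    for (char c : source_file) {
        if (c == '\\' || c == '"') escaped += '\\';
        escaped += c;
    }
    return "#line " + std::to_string(source_line) + " \"" + escaped + "\"\n" +
           snippet + "\n";
}
```

For example, with_line_directive("CSSGrammar.y", 25, "int x = 0;") yields the directive from the question followed by the snippet on its own line. A generator would normally emit a second directive afterwards to switch back to the generated file's own name and line.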
It is a directive telling the compiler to believe that the following line is line 25 of the file CSSGrammar.y. Then, if the compiler detects an error on the second line after the directive, it is reported as coming from line 26 of CSSGrammar.y.
Programs generating C files, like bison, or yacc, or flex, or ANTLR, or even the (obsolete) MELT use that possibility a lot.
If debugging information is generated (e.g. with gcc -g), it will point to the CSSGrammar.y file in your example.
The 'yacc' parser generator consumes files that end in .y, and emits files that contain C or C++. It adds these #line lines to allow a debugger to get back to ye olde original source, accept no substitutes.
It's a C preprocessor directive. It tells the C parser to drop its own line count for the source file and pretend that the next line is line 25.
With this information it's easier for you to debug the source file. The yacc file is translated into a C source file, and this is the source line it pretends to be at.
Using #line forces the compiler to experience amnesia about what file it's compiling and what line it's on, and loads in the new data.
Note: The compiler still compiles from the line it was on.
Related
In my C++ project, I have a header with a line like this:
enum { OK, ERROR_1, ERROR_2 };
When compiling with GCC (v 9.4.0), I get
error: expected identifier before '(' token
Examining the preprocessor output gives
enum {
# 53 "/path/to/file.h" 3 4
(0)
# 53 "/path/to/file.h"
, ERROR_1, ERROR_2 };
I searched my project for a macro that would define OK and replace it with (0) but to no avail. So my question is how can I track where this (0) comes from? I read the docs on preprocessor output, but haven't found anything that would aid me in my problem.
You can use for example -E -fdirectives-only as options to GCC. It will give you a preprocessor output with all #includes resolved and including file/line markers, but with the macro definitions still in place and unexpanded.
Then simply search for #define OK in the output and search upwards for a # N marker where N is an integer. The marker will refer to the file/line from where the definition originates.
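The upward search can also be automated. The following is a minimal sketch that scans such preprocessor output for a definition and reports the position given by the nearest preceding marker, counting the lines in between. It assumes the definition is spelled exactly "#define NAME" at the start of a line and that file names contain no escaped quotes; Origin and find_define are hypothetical names, not part of GCC.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical result type: where a macro definition was found.
struct Origin {
    bool found;
    std::string file;
    int line;
};

inline Origin find_define(const std::string& preprocessed, const std::string& macro) {
    assert(!macro.empty());
    const std::string needle = "#define " + macro;
    std::istringstream in(preprocessed);
    std::string text, file;
    int line = 0;
    while (std::getline(in, text)) {
        // Linemarker: # <number> "<file>" [flags]. It describes the NEXT line.
        if (text.size() > 2 && text[0] == '#' && text[1] == ' ' &&
            std::isdigit(static_cast<unsigned char>(text[2]))) {
            const std::size_t open = text.find('"');
            const std::size_t close = text.rfind('"');
            if (open != std::string::npos && close > open) {
                line = std::stoi(text.substr(2));
                file = text.substr(open + 1, close - open - 1);
                continue;
            }
        }
        if (text.compare(0, needle.size(), needle) == 0) {
            // Make sure "OK" does not match "OKAY".
            const bool name_ends =
                text.size() == needle.size() ||
                !(std::isalnum(static_cast<unsigned char>(text[needle.size()])) ||
                  text[needle.size()] == '_');
            if (name_ends) return Origin{true, file, line};
        }
        ++line;  // an ordinary line advances the presumed line number
    }
    return Origin{false, std::string(), 0};
}
```

Feeding it the text produced by the -E -fdirectives-only run and the name OK should give the header and line to look at.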
(By the way, you are looking on the wrong page of documentation. For the possible command line options affecting the preprocessor see https://gcc.gnu.org/onlinedocs/gcc/Preprocessor-Options.html.)
In my parser generated by flex, I would like to be able to store each line in the file, so that when reporting errors, I can show the user the line that the error occurred on.
I could of course do this using a vector and read in all lines from the file before/after lexing, but this would just add to the time needed to parse a file.
What I thought I could do instead is store the line whenever a newline character is matched, and insert the current line into a vector. So my question is: is there a variable/macro in flex that holds the current line? (Something like yyline, perhaps.)
Note: I am also using bison
By itself, lex/flex does not do what you ask. As noted, you want this for reporting error messages. (I do something like this in vi like emacs).
With lex/flex, the only way to store the entire line is to record each token from the current line into your own line-buffer. That can be complicated, especially if your lexer has to handle multi-line content (such as comments or strings).
The yytext variable only shows you the most recently matched token (and yyleng, the corresponding length). If your lexer does a simple ECHO, that is a token just like the ones you pay attention to.
Reading the file in advance as noted is one way to simplify the problem. In vi like emacs, the lexers read via a function from the in-memory buffer rather than from an input stream. It bypasses the normal stream-handling logic by redefining the YY_INPUT macro, e.g.,
#define YY_INPUT(buf,result,max_size) result = flt_input(buf,max_size)
Likewise, ECHO is redefined (since the editor reads the results back rather than letting them go to the standard output):
#define ECHO flt_echo(yytext, yyleng)
and it traps errors detected by the lexer with another redefinition:
#define YY_FATAL_ERROR(msg) flt_failed(msg);
However you do this, the yylineno value reported for a given token will be at the end of parsing a given token.
While it is nice to report the entire line in context in an error message, it is also useful to track the line and column number of each token -- various editors can deal with lines like this
filename:line:col:message
If you build up your line-buffer by tracking tokens, it might be relatively simple to track the column on which each token begins as well.
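A minimal sketch of such a line buffer, written as a plain C++ class so it can be tried without flex. LineTracker and its members are hypothetical names. In a scanner it would be fed yytext and yyleng for every match, for example from each action or through flex's YY_USER_ACTION hook; that wiring is assumed and not shown. Columns are counted in characters, so a tab counts as one column.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: rebuilds source lines from the tokens the lexer matches
// and reports where each token starts.
class LineTracker {
public:
    struct Pos {
        int line;    // 1-based line on which the token starts
        int column;  // 1-based column on which the token starts
    };

    // Call once per matched token. Tokens may span lines (comments, strings).
    Pos feed(const char* text, std::size_t length) {
        assert(text != nullptr);
        const Pos start{static_cast<int>(finished_.size()) + 1,
                        static_cast<int>(current_.size()) + 1};
        for (std::size_t i = 0; i < length; ++i) {
            if (text[i] == '\n') {
                finished_.push_back(current_);
                current_.clear();
            } else {
                current_ += text[i];
            }
        }
        return start;
    }

    // Text of a 1-based line seen so far; the last one may still be incomplete.
    std::string line(int number) const {
        const int done = static_cast<int>(finished_.size());
        if (number >= 1 && number <= done) return finished_[number - 1];
        if (number == done + 1) return current_;
        return std::string();
    }

private:
    std::vector<std::string> finished_;
    std::string current_;
};
```

With the position returned by feed and the text returned by line, an error message can be printed in the filename:line:col:message form, followed by the offending line. Note that the line being scanned is only complete once its newline has been matched.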
I came across this in a source code:
#define DEFAULT_PATHNAME "#SDK_DEFAULT_PATHNAME#"
What does the # symbol denote in this case?
Edit:
CMake was used to generate this project.
This value is used as a path to a file
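CMake has this wonderful command, configure_file, which lets your build system generate a file used in the build, with the value of a variable such as SDK_DEFAULT_PATHNAME put in place of a placeholder in the "configured file". Note that configure_file itself substitutes placeholders written as @SDK_DEFAULT_PATHNAME@ or ${SDK_DEFAULT_PATHNAME}, so a #SDK_DEFAULT_PATHNAME# marker would have to be replaced by some other step of the project's build.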
In this case it's part of the string, nothing special.
On Windows for example, you could have the following string:
#define DEFAULT_PATHNAME "%PATH_TO_SDK%"
with the % character playing the same role. In C++ and in strings in general, it has no meaning (unlike \ which is used to escape characters).
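A small sketch to make the point concrete: inside a string literal the # characters are ordinary text, whereas the preprocessor's stringizing operator # only acts in the replacement list of a function-like macro. The STRINGIZE macro and the function names are added here purely for contrast and are not from the question.

```cpp
#include <cassert>
#include <cstring>

// The definition from the question: # is just a character inside the literal.
#define DEFAULT_PATHNAME "#SDK_DEFAULT_PATHNAME#"

// For contrast (hypothetical): here # is the stringizing operator.
#define STRINGIZE(x) #x

inline const char* default_pathname() { return DEFAULT_PATHNAME; }
inline const char* stringized_name() { return STRINGIZE(SDK_DEFAULT_PATHNAME); }

// Self-check of the two behaviours.
inline void check_hash_is_plain_text() {
    assert(std::strcmp(default_pathname(), "#SDK_DEFAULT_PATHNAME#") == 0);
    assert(default_pathname()[0] == '#');
    assert(std::strcmp(stringized_name(), "SDK_DEFAULT_PATHNAME") == 0);
}
```

So whatever gives the # markers a meaning has to be something outside the C++ language, such as a build step or the program that consumes the string.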
EDIT:
To clarify, esp. with regards to your comment:
that value is used as a path to a file for the program to open, when removing the # the program broke
The operating system may need to read this character, as I mentioned it with the % example on Windows, to consider the path as something to look up in the environment variables for example. Once again, it has no special meaning in C++ or strings in general, but may have for other programs.
This particular C++ code project has 0xFF byte markers that prefix function definitions.
What is the purpose of this? Is it to aid some simple source file parser?
Apparently the compiler ignores these markers.
That could be a form feed (FF, ASCII 12, i.e. the byte 0x0C rather than 0xFF; see Wikipedia), in other words a whitespace character.
The form feed character is sometimes used in plain text files of source code as a delimiter for a page break, or as marker for sections of code. Some editors, in particular emacs, have built-in commands to page up/down on the form feed character. This convention is predominantly used in Lisp code, and is also seen in Python source code.
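To illustrate both points, the sketch below checks that '\f' is classified as whitespace and splits a source text into "pages" at each form feed, roughly what an editor's page navigation or a printer would do. The name split_pages is hypothetical, and the byte value check assumes an ASCII-compatible character set.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: splits text into pages at form feed characters.
inline std::vector<std::string> split_pages(const std::string& source) {
    std::vector<std::string> pages(1);
    for (char c : source) {
        if (c == '\f') {
            pages.emplace_back();  // a form feed starts a new page
        } else {
            pages.back() += c;
        }
    }
    return pages;
}

// Self-check: form feed is byte 12 (0x0C) and counts as whitespace.
inline void check_form_feed() {
    assert('\f' == 0x0C);
    assert(std::isspace(static_cast<unsigned char>('\f')) != 0);
}
```

Because the character is whitespace, the compiler skips it between tokens, which is why the markers have no effect on compilation.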
It used to be common in sources back when source code was commonly printed on paper for review/archival.
Printers will interpret FF in plain text documents as a 'page break'.
Semi-relevant: https://twitter.com/sehetw/status/297904888321544192
When compiling a .cpp file from Emacs through M-x compile (which runs the folder's Makefile), I see the following on the compilation buffer (displayed in compilation mode):
In file included from: /path/to/file1:60,
from /path/to/file2.h:15,
from /path/to/file3.cpp:16:
/path/to/file4.h:28:2: #warning This file includes at least one deprecated or antiquated header which may be removed without
further notice at a future date. Please use a non-deprecated interface
with equivalent functionality instead. For a listing of replacement
headers and interfaces, consult the file backward_warning.h. To
disable this warning use -Wno-deprecated.
Aside from the actual warning message, how should I understand this trace? i.e. which file generated the warning? (file1,file2, file3 or file4)?
Also, why is there a comma after the file2 line and a colon after the file3 line, and why does the line with file4 include two numbers separated by colons?
I am using Emacs 24.2.1, with gcc-4.4.5-x86_64.
The construct that actually triggered the warning (a #warning preprocessor directive, in this case) is in file4. The stuff above that is a trace of the #include stack, innermost-but-one first: in this case, file3 included file2, which included file1, which included file4.
When gcc knows the column number of the construct that triggered a diagnostic, it prints the file name, a colon, the line number, another colon, and the column number, as you see on the file4 line. The first number is the line number (28) and the second number is the column number (in this case, you will find that the # of #warning is in column 2). When gcc doesn't know the column number, it just prints the file name, a colon, and the line number. This is the case for the #include stack, as it does not bother recording the exact column of #include directives. Emacs' compilation-mode understands how to parse both these syntaxes: you will find that if you use C-x ` to page through the diagnostics, when there is a column number available, Emacs will place the cursor at the appropriate column.
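A minimal sketch of parsing both forms, along the lines of what a tool reading these messages has to do. The Location struct and the function names are hypothetical, a column of 0 stands for "not given", and file names containing a colon (such as Windows drive letters) are not handled.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical result type; column 0 means no column was printed.
struct Location {
    std::string file;
    int line = 0;
    int column = 0;
};

// Reads a run of digits starting at pos and advances pos past it.
inline int read_number(const std::string& text, std::size_t& pos) {
    int value = 0;
    while (pos < text.size() && std::isdigit(static_cast<unsigned char>(text[pos]))) {
        value = value * 10 + (text[pos] - '0');
        ++pos;
    }
    return value;
}

// Parses "file:line" or "file:line:column" at the start of text.
inline bool parse_location(const std::string& text, Location& out) {
    const std::size_t colon = text.find(':');
    if (colon == std::string::npos) return false;
    std::size_t pos = colon + 1;
    const int line = read_number(text, pos);
    if (line == 0) return false;  // no line number after the file name
    out.file = text.substr(0, colon);
    out.line = line;
    out.column = 0;
    if (pos < text.size() && text[pos] == ':') {
        std::size_t after = pos + 1;
        out.column = read_number(text, after);  // stays 0 if no digits follow
    }
    return true;
}
```

Applied to the file4 line it gives line 28 and column 2; applied to the file2 line it gives line 15 and no column, and the trailing comma or colon is simply ignored.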
The colons and commas at the ends of these reports are just to conform to English punctuation convention; they don't mean anything.
The warning was generated in file4.h, on line 28.
The comma is because you're in the middle of a list, the colon designates the end of the list. The two numbers are line number and column number.
It actually shows the inclusion path, saying that:
in column 2 of line 28 of file4.h,
which was included from file1 (line 60),
which was included from file2.h (line 15),
which was included from file3.cpp (line 16),
there was a warning ...
Every compiler has to keep track of this chain; it has nothing to do with GCC being especially smart.
Your compiler only compiles file3.cpp, and every other file is parsed only as a result of being included, directly or indirectly, from that file.