Reset the C/C++ preprocessor #line to the physical file/line - c++

I have a code generator that's going to take some user-written code and embed chunks of it in a larger generated file. I want the underlying compiler to provide good diagnostics when there are defects in the user's code, but I also don't want defects in the generated code to be misattributed to the source when they shouldn't be.
I intend to emit #line lineNum "sourceFile" directives at the beginning of each chunk of user-written code. However, I can't find any documentation of the #line directive that mentions a technique for 'resetting' __LINE__ and __FILE__ back to the actual line in the generated file once I leave the user-provided code. The ideal solution would be analogous to the C# preprocessor's #line default directive.
Do I just need to keep track of how many lines I've written and manually reset that myself? Or is there a better way, some sort of reset directive or sentinel value I can pass to #line to erase the association with the user's code?
It looks like this may have been posed before, though there's no solid answer there. To distinguish this from that, I'll additionally ask whether the lack of answer there has changed with C++11.

A technique I've used before is to have my code generator output a # by itself on a line when it wants to reset the line directives, and then use a simple awk script to postprocess the file and change those to correct line directives:
#!/bin/awk -f
/^#$/ { printf "#line %d \"%s\"\n", NR+1, FILENAME; next; }
{ print; }
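Where awk is not available in the build, the same post-processing pass could be sketched in C++. This is only an illustration of the idea above: the function name resolve_reset_markers is made up for this example, and it assumes the generator never emits a lone # for any other purpose and that the file name needs no escaping.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: replaces every line consisting of a lone '#' with a
// #line directive naming the physical line that follows it.
std::string resolve_reset_markers(const std::string& text,
                                  const std::string& filename) {
    std::istringstream in(text);
    std::ostringstream out;
    std::string line;
    int line_number = 0;  // 1-based number of the line just read
    while (std::getline(in, line)) {
        ++line_number;
        if (line == "#") {
            // The directive describes the NEXT line, hence line_number + 1.
            out << "#line " << (line_number + 1) << " \"" << filename << "\"\n";
        } else {
            out << line << '\n';
        }
    }
    return out.str();
}
```

Because each marker is replaced by exactly one line, the physical line numbers do not shift during the pass.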

Yes, you need to keep track of the number of lines you've output, and you need to know the name of the file you're outputting into. Remember that the line number you specify is the line number of the next line. So if you've written 12 lines so far, you need to output #line 14 "filename", since the #line directive will go on line 13, and so the next line is 14.
There's no difference between the #line preprocessor directive in C and C++.
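A minimal sketch of that bookkeeping, assuming the generator funnels all output through one writer object. The class GeneratedFileWriter and its member names are hypothetical, each call to write_line is assumed to receive text without embedded newlines, and file names are assumed not to need escaping.

```cpp
#include <sstream>
#include <string>
#include <utility>

// Hypothetical writer that counts physical lines so it can reset #line.
class GeneratedFileWriter {
public:
    explicit GeneratedFileWriter(std::string output_name)
        : output_name_(std::move(output_name)) {}

    // Writes one physical line (text must not contain '\n') and counts it.
    void write_line(const std::string& text) {
        out_ << text << '\n';
        ++lines_written_;
    }

    // Attributes the following lines to the user's file.
    void begin_user_chunk(const std::string& user_file, int user_line) {
        write_line("#line " + std::to_string(user_line) + " \"" + user_file + "\"");
    }

    // Attributes the following lines to the generated file again. The
    // directive itself lands on line lines_written_ + 1, so the line after
    // it is lines_written_ + 2.
    void end_user_chunk() {
        write_line("#line " + std::to_string(lines_written_ + 2) + " \"" +
                   output_name_ + "\"");
    }

    std::string str() const { return out_.str(); }

private:
    std::string output_name_;
    std::ostringstream out_;
    int lines_written_ = 0;
};
```

With 12 lines already written, end_user_chunk() emits #line 14, matching the arithmetic described above.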

Suppose the input to the code generator, "user.code", contains the following:
int foo () {
return error1 ();
}
int bar () {
return error2 ();
}
Suppose you want to augment this so that it basically looks like this:
int foo () {
return error1 ();
}
int generated_foo () {
return generated_error1 ();
}
int bar () {
return error2 ();
}
int generated_bar () {
return generated_error2 ();
}
Except you don't want that. You want to add #line directives to the generated code so that the compiler messages indicate whether the errors / warnings are from the user code or the autogenerated code. The #line directive indicates the source of the next line of code (rather than the line containing the #line directive).
#line 1 "user.code"
int foo () {
return error1 ();
}
#line 7 "generated_code.cpp" // NOTE: This is line #6 of generated_code.cpp
int generated_foo () {
return generated_error1 ();
}
#line 5 "user.code"
int bar () {
return error2 ();
}
#line 17 "generated_code.cpp" // NOTE: This is line #16 of generated_code.cpp
int generated_bar () {
return generated_error2 ();
}

@Novelocrat,
I had asked this question here before, and no solid answers were posted, but I figured out that if line directives that point to the user code are inserted in the auto-generated code, the auto-generated code becomes hard to relocate. You have to keep the auto-generated and user code in locations where the compiler can find them when reporting errors. I thought it was better to simply insert the file name and line numbers of the user code in the generated code. In good text editors it is a matter of a couple of keystrokes to jump to a line in a file by placing the cursor on the file name.
Eg: in vim placing the cursor on the file-name and pressing g-f takes you to the file, and :42 takes you to the line 42 (say) that had the error.
Just posting this bit here, so that someone else coming up with the same questions might consider this alternative too.

Have you tried what __LINE__ and __FILE__ give you? I believe they are taken from your #line directives (what would be the point if not?).
(A quick test with gcc-4.7.2 and clang-3.1 confirms my hunch).
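A small sketch of such a test. The function names reported_line and reported_file are invented for the example; the directive values are arbitrary.

```cpp
#include <string>

// Everything after this directive is reported as coming from "user.code",
// starting at line 100.
#line 100 "user.code"
int reported_line() { return __LINE__; }          // evaluates to 100
std::string reported_file() { return __FILE__; }  // evaluates to "user.code"
```

Both macros follow the most recent #line directive rather than the physical position in the file, which is why the generator has to emit a second directive to point back at the generated file.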

Related

C++ Static Code Analysis - Show header used in statement or line

I'm searching for a tool that reports the header used (if there is one or more) for every line/statement in my C++ code.
Example:
#include<iostream>
std::cout << "hallo";
The output I'd like to see:
line 2: std::cout uses "iostream"
I found this question, the tools there do most of the part, they show dependency per file.
Does anyone know of such a tool, or how to accomplish this with the tools given in the answers to the question above?
Goal: I'm checking code for conformity to a standard for which I have a list of allowed headers. With the desired output I can create a metric saying something like: 60% of the code is using allowed headers, 15% is using other headers, and so on.
This is not completely what you want but you can use Eclipse CDT to know where std::cout is declared.
If you press F3 when cout is selected in Eclipse, you will jump to this line of code inside iostream header file on the system with gcc 7:
extern ostream cout; /// Linked to standard output
You can try CppDepend to get all the methods called by a specific one with the location of each method called.

Recovering/changing function name automatically in ROOT (C++) by CERN

Following is an example of a code, where I define (line 1) the name of the function to be used later (line 4).
char *funcname = "addition"; //line 1
void addition(){ //line 2
//file=Form("%s.txt",funcname);
TFile *ofile = new TFile(Form("%s.root",funcname) ,"RECREATE");//line 4 //EDIT
}
Now my question is: how can I use a similar code like Form("%s",funcname) to state the name of the function directly in line 4 without requiring line 1 by recovering the function name somehow, or change the line 2 function name as in the example shown above?
For example, I was trying to alter my line 2 code:
void Form("%s",funcname)(){
}
but technically that would mean this:
void "addition"(){
}
and not this:
void addition(){
}
I do not want the quotation marks. So what's the solution?
EDIT:
The above lines of code will be in a file named addition.C and it generates a file named addition.root by running the command root addition.C in the terminal.
I am trying to get an output using the code at line 4 but with a different output file name every time I change the name of the function at line 2. This is so that I do not overwrite the output generated before when I ran the file with a different name giving me some output file.
Did I make the question clear enough? I thought this was a legitimate question and didn't expect so many downvotes in such a small time! Any suggestions to make more edits are welcome.
EDIT 2:
Changing function name dynamically is probably not possible as people suggested in comments. Then the solution should be to get the function name some way (automatically), someway like
TFile *ofile= new TFile(Form("%s.root", __func__), "RECREATE"); //line 4
but unfortunately that doesn't work.
In C there is the standard predefined identifier __func__ that holds the name of the current function.
file=Form("%s.txt", __func__); //line 4
should do the trick.
Since you have also tagged this with C++: __func__ has been standard there since C++11 as well. Its exact contents are implementation-defined, but GCC and Clang give the plain, unmangled function name.
gcc has an extension __PRETTY_FUNCTION__ that gives you the demangled name together with the argument types.
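A minimal sketch of deriving the output file name from __func__ in plain C++. It leaves out ROOT's TFile and Form entirely and only builds the name; the std::string return type is an assumption made for the example, and the result shown relies on the compiler using the plain function name (as GCC and Clang do).

```cpp
#include <string>

// Stand-in for the question's macro function: renaming the function
// changes the generated file name without touching the body.
std::string addition() {
    // __func__ is "addition" here with GCC and Clang; the exact text is
    // implementation-defined.
    return std::string(__func__) + ".root";
}
```

The returned string could then be passed on with c_str() wherever a const char* file name is expected.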

put code executed by GDB

I have a small question. Is it possible in C/C++ to add bits of code that would interact a bit more with GDB?
Let say I have a function like
void gdb_print(const char*);
This would print information in gdb while it executes! If it's possible, that would be awesome. It would make it simple to track some information, and faster in some ways!
I need something like this because we're writing some plugins, and output from cout or cerr does not reach the console at all. So it would be something discreet. We could also add some stuff like:
#ifdef __DEBUG__
#define debug_msg(x) gdb_print(x)
#else
#define debug_msg(x)
#endif
If it doesn't exist, let me know what you think about this!
I need something like this, because we're writing some plugins, and info from cout or cerr are not going to the console at all.
You can always write to console with:
FILE *console = fopen("/dev/tty", "w");
if (console != NULL) fprintf(console, "Your debug message\n");
I don't know of a method to write specifically to the terminal where GDB is running (which could well be different terminal from which the program itself was invoked).
Try redirecting stderr and stdout to a file using freopen.
This is a sample code to redirect stdout to a file in runtime:
/* freopen example: redirecting stdout */
#include <stdio.h>
int main ()
{
freopen ("myfile.txt","w",stdout);
printf ("This sentence is redirected to a file.");
fclose (stdout);
return 0;
}
static int gdb = 0;
void gdb_print(char const * msg) {
if(gdb) printf("\tGDB: %s\n", msg);
}
When you load your program up in gdb, set a breakpoint in main, then set gdb to a non-zero value. This isn't the cleanest solution (and certainly not automated) but I think it'll give you what you're looking for. Be sure to use the preprocessor to remove the calls in non-debug builds (no sense in having all those extra compares that'll never evaluate to true).
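Putting the flag and the preprocessor guard together might look like the sketch below. Several details are assumptions rather than part of the answer: the flag is renamed gdb_attached and made volatile so the compiler cannot assume it stays zero, gdb_print returns a bool purely so the behaviour can be checked, and ENABLE_DEBUG_MSG is a hypothetical build flag.

```cpp
#include <cstdio>

// Switched on from the debugger, e.g.  (gdb) set var gdb_attached = 1
// volatile keeps the compiler from assuming the flag is always zero.
static volatile int gdb_attached = 0;

// Prints only when the flag was switched on; returns whether it printed
// (the return value is an addition for this sketch).
static bool gdb_print(const char* msg) {
    if (!gdb_attached) return false;
    std::printf("\tGDB: %s\n", msg);
    return true;
}

// ENABLE_DEBUG_MSG is a hypothetical build flag (e.g. -DENABLE_DEBUG_MSG).
// Without it the calls compile to nothing.
#ifdef ENABLE_DEBUG_MSG
#define debug_msg(x) gdb_print(x)
#else
#define debug_msg(x) ((void)0)
#endif
```

Call sites then use debug_msg("...") and cost nothing in builds where the flag is not defined.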

How to strip C++ style single line comments (`// ...`)

For a small DSL I'm writing, I'm looking for a regex to match a comment string at the end of the line, like the // syntax of C++.
The simple case:
someVariable = 12345; // assignment
Is trivial to match but the problem starts when I have a string in the same line:
someFunctionCall("Hello // world"); // call with a string
The // in the string should not match as a comment
EDIT - The thing that compiles the DSL is not mine. It's a black box as far as I'm concerned, which I don't want to change, and it doesn't support comments. I just want to add a thin wrapper to make it support comments.
EDIT
Since you are effectively preprocessing a sourcefile, why don't you use an existing preprocessor? If the language is sufficiently similar to C/C++ (especially regarding quoting and string literals), you will be able to just use cpp -P:
echo 'int main() { char* sz="Hello//world"; /*profit*/ } // comment' | cpp -P
Output: int main() { char* sz="Hello//world"; }
Other ideas:
Use a proper lexer/parser instead
Have a look at
CoCo/R (available for Java, C++, C#, etc.)
ANTLR (idem)
Boost Spirit (with Spirit Lex to make it even easier to strip the comments)
All suites come with sample grammars that parse C, C++ or a subset thereof
shoosh wrote:
EDIT - The thing that compiles the DSL is not mine. It's a black box as far as I'm concerned, which I don't want to change, and it doesn't support comments. I just want to add a thin wrapper to make it support comments.
In that case, create a very simple lexer that matches one of three tokens:
// ... comments
string literals: " ... "
or, if none of the above matches, match any single character
Now, while you iterate over these 3 different types of tokens, simply print tokens (2) and (3) to stdout (or to a file) to get the uncommented version of your source file.
A demo with GNU Flex:
example input file, in.txt:
someVariable = 12345; // assignment
// only a comment
someFunctionCall("Hello // world"); // call with a string
someOtherFunctionCall("Hello // \" world"); // call with a string and
// an escaped quote
The lexer grammar file, demo.l:
%%
"//"[^\r\n]* { /* skip comments */ }
"\""([^"]|[\\].)*"\"" {printf("%s", yytext);}
. {printf("%s", yytext);}
%%
int main(int argc, char **argv)
{
while(yylex() != 0);
return 0;
}
And to run the demo, do:
flex demo.l
cc lex.yy.c -lfl
./a.out < in.txt
which will print the following to the console:
someVariable = 12345;
someFunctionCall("Hello // world");
someOtherFunctionCall("Hello // \" world");
EDIT
I'm not really familiar with C/C++, and just saw @sehe's recommendation of using a pre-processor. That looks to be a far better option than creating your own (small) lexer. But I think I'll leave this answer since it shows how to handle this kind of stuff if no pre-processor is available (for whatever reason: perhaps cpp doesn't recognise certain parts of the DSL?).
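If neither Flex nor cpp is an option, the same three-token idea can be sketched by hand in C++. This is an illustration only: the function name strip_line_comments is made up, and it assumes the DSL's string literals use double quotes with backslash escapes, as in the demo input.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: drops // comments but copies string literals verbatim.
std::string strip_line_comments(const std::string& src) {
    std::string out;
    std::size_t i = 0;
    while (i < src.size()) {
        if (src[i] == '"') {
            // Token: string literal. Copy it up to the closing quote,
            // keeping backslash-escaped characters such as \" intact.
            out += src[i++];
            while (i < src.size()) {
                char c = src[i++];
                out += c;
                if (c == '\\' && i < src.size()) {
                    out += src[i++];
                } else if (c == '"') {
                    break;
                }
            }
        } else if (src[i] == '/' && i + 1 < src.size() && src[i + 1] == '/') {
            // Token: comment. Skip to the end of the line, keep the newline.
            while (i < src.size() && src[i] != '\n') ++i;
        } else {
            // Token: any other single character.
            out += src[i++];
        }
    }
    return out;
}
```

Whitespace that preceded a comment is left in place, so a wrapper may want to trim trailing spaces afterwards.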

Parsing a C++ source file after preprocessing

I am trying to analyze C++ files using my custom-made parser (written in C++). Before I start parsing, I would like to get rid of all #defines. I want the source file to be compilable after preprocessing. So the best way would be to run the C preprocessor on the file.
cpp myfile.cpp temp.cpp
// or
g++ -E myfile.cpp > temp.cpp
[New suggestions are welcome.]
But due to this, the original lines and their line numbers will be lost as the file will contain all the header information also and I want to retain the line numbers. So the way out I have decided is,
Add a special symbol before every line in the source file (except preprocessors)
Run the preprocessor
Extract the lines with that special symbol and analyze them
For example, a typical source file will look like:
#include<iostream>
#include"xyz.h"
int x;
#define SOME value
/*
** This is a test file
*/
typedef char* cp;
void myFunc (int* i, ABC<int, X<double> > o)
{
//...
}
class B {
};
After adding symbol it will be like,
#include<iostream>
#include"xyz.h"
#3#int x;
#define SOME value
#5#/*
#6#** This is a test file
#7#*/
#8#typedef char* cp;
#9#
#10#void myFunc (int* i, ABC<int, X<double> > o)
#11#{
#12# //...
#13#}
#14#
#15#class B {
#16#};
Once all the macros and comments are removed, I will be left with thousands of lines, of which a few hundred will be the original source code.
Is this approach correct? Am I missing any corner cases?
You realize that g++ -E adds some of its own lines to its output which indicate line numbers in the original file? You'll find lines like
# 2 "foo.cc" 2
which indicate that you're looking at line 2 of file foo.cc . These lines are inserted whenever the regular sequence of lines is disrupted.
The imake program that used to come with X11 sources used a faintly similar system, marking the ends of lines with ## so that it could post-process them properly.
The output from gcc -E usually includes line markers of the form # linenum "filename" flags, which play the same role as #line directives; you could perhaps use those instead of your symbols.
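A rough sketch of reading those markers in a custom parser. The struct LineMarker and the function parse_linemarker are hypothetical names, the trailing flags are ignored, and file names containing escaped quotes are not unescaped.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

struct LineMarker {
    int line;          // line number that applies to the NEXT output line
    std::string file;  // file name as written between the quotes
};

// Accepts both  # 2 "foo.cc" 2  and  #line 2 "foo.cc"; returns false for
// anything else (ordinary code, other directives).
bool parse_linemarker(const std::string& text, LineMarker& out) {
    std::istringstream in(text);
    char hash = 0;
    in >> hash;
    if (hash != '#') return false;
    in >> std::ws;
    if (in.peek() == 'l') {  // optional "line" keyword
        std::string word;
        in >> word;
        if (word != "line") return false;
    }
    int line = 0;
    if (!(in >> line)) return false;
    std::string rest;
    std::getline(in, rest);
    std::size_t first = rest.find('"');
    std::size_t last = rest.rfind('"');
    if (first == std::string::npos || last == first) return false;
    out.line = line;
    out.file = rest.substr(first + 1, last - first - 1);
    return true;
}
```

A parser could keep the most recent marker, increment its line number for each subsequent ordinary line, and keep only the lines whose file matches the one being analyzed.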