gcc for parsing code

I would like to know how to use GCC as a library to parse C/C++/Java/Objective-C/Ada code for my program.
I want to bypass preprocessing and add the prefix My to every user-written function,
so that Print(); becomes MyPrint();. I also wish to do this with variables.

You can look here:
http://codesynthesis.com/~boris/blog/2010/05/03/parsing-cxx-with-gcc-plugin-part-1/
This describes how to use the GCC plugin interface to parse C++ code. Other languages should be handled in the same manner.
You can also try Pork from Mozilla:
https://wiki.mozilla.org/Pork
When I tried Pork, I spent an hour or so fixing compile problems, but after that
I could write scripts like this:
rewrite SyncPrimitiveUpgrade {
type PRLock* => Mutex*
call PR_NewLock() => new Mutex()
call PR_Lock(lock) => lock->Lock()
call PR_Unlock(lock) => lock->Unlock()
call PR_DestroyLock(lock) => delete lock
}
It finds every use of the type PRLock and replaces it with Mutex; it also finds calls to functions
like PR_NewLock and replaces them with "new Mutex".

You might wish to investigate the sparse C parser. It understands a lot of C (all the C used in the Linux kernel sources, which is a fairly good subset of legal ANSI-C and GNU-C extensions) and provides a few sample compiler backends to provide a lint-like static analysis tool for type checking.
While the code looks very clean and thorough, your task might be easier done via another mechanism -- the example.c included with the sparse source that demonstrates a compiler is 1955 lines long.

For C, you cannot do that reliably. If you skip preprocessing you will -- in general -- not have valid C code to be parsed. E.g.
#define FOO
#define BAR
#define BAZ
FOO void BAR qux BAZ(void) { }
How is the parser supposed to recognize this as a function definition of qux without doing the preprocessing?

First, GCC is not a library, and is not structured to be one (in contrast to LLVM).
Why (i.e. what for) do you want to parse C, C++, Ada source code?
I would consider (assuming a GCC 4.6 version) extending GCC either through plugins written in C or, preferably, using MELT, a high-level domain-specific language to extend GCC (disclaimer: I am the main author of MELT).
But using GCC as a library is not realistic at all.
I really think that for what you want to achieve, MELT is the right tool. However, it is poorly documented. Please use the gcc-melt@googlegroups.com list to ask questions.
And be aware that extending GCC does take some amount of work (more than a week perhaps), because you need to partly understand the GCC internal representations.

Our DMS Software Reengineering Toolkit can parse C, C++, Java and Ada code (not Objective C at this time) in a wide variety of dialects and carry out transformations on the code. DMS's C and C++ front ends include a preprocessor, so you can have the code preprocessed before you parse.
I probably don't understand what you want to do, because it seems strange to rename every function and (global?) variable with a "My...." prefix. But you could do that with some DMS rules. A rough sketch of renaming user functions for GCC3:
domain C~GCC3.
rule rewrite_function_names(t: type_designator, i: IDENTIFIER, p: parameter_list, s: statements):
function_header->functionheader
"\t \i(\p) { \s } " -> "\t \renamed\(\i\) (\p) { \s }" ;
and a helper function "renamed" that takes a tree node containing an identifier, and returns a tree node with the renamed identifier.
Because DMS patterns only match against the parse trees, you won't get any false positives.
You'd need some additional patterns to handle various different syntax cases within each language (e.g., for C, the "void" return type, because "void" isn't a type designator in the syntax, and global variable declarations), and different rules for different languages (Ada's syntax is not the same as that of C).
This might seem like a big hammer for your task, but if you really insist on doing this for a variety of languages in a reliable way, it seems hard to avoid the problem of getting decent parsers for all those languages. (And if you are really going to do this for all these languages, DMS can be taught to handle Objective C the same way we have taught it to handle the other languages.)
Your alternative is some kind of string hacking solution, which might work 95% of the time. If you can live with that, then Perl or something similar is likely your answer.

Forget about GCC; it's built as a compiler's parser, not an analysis parser. You'd do much better using something like libclang, a C interface to Clang, which can process both C and C++.

Related

Is it possible to create a user-defined datatype in a language like C/C++(or maybe any) from a string as user input or from file

Well, this might be a very weird question, but my curiosity about it has struck pretty hard. So here it goes...
NOTE: Let's take the language C into consideration here.
As programmers we usually define a user-defined datatype(say struct) in the source code with the appropriate name.
Suppose I have a program in which I have a structure defined as:
struct Animal {
char *name;
int lifeSpan;
};
And also I have started the execution of this program.
Now, my question here is;
What if I want to define a new structure called "Plant" just like "Animal" mentioned above in my program, without writing its definition in the source code itself(which is obviously impossible currently) but rather from a user input string(or a file input) during runtime.
Let's say my program takes an input string from a text file named file1.txt whose content is:
struct Plant {
char *name;
int lifeSpan;
};
What I want now is to have a new structure named "Plant" in my program which is already in execution. The program should read the file content and create a structure as written in the file and attach it to itself on-the-go.
I have checked out a solution for C++ in the discussion "Declaring a data type dynamically in C++", but it doesn't seem to have a very convincing solution.
The solution I am looking for is at the compiler-linker-loader level rather than from the language itself. I would be very pleased and thankful if anyone is willing to share their ideas on this.

What you're asking about is basically "can we implement C as a scripting language?", since this is the only way code can be executed after compilation.
I'm aware that people have been writing (mostly in the comments) that it's possible in other languages but isn't possible in C, since C is a compiled language (hence data types should be defined during compile time).
However, to the best of my knowledge it's actually possible (and might not be as hard as one would imagine).
There are many possible approaches (machine code emulation (VM), JIT compilation, etc.).
One approach is to use a C compiler to compile the C script as an external dynamic library (.dll on Windows, .so on Linux, etc.) and then "load" the compiled library and execute the code (this is pretty much the JIT compilation approach, for lazy people).
EDIT:
As mentioned in the comments, by using this approach, the new type is loaded as part of an external library.
The original code won't know about this new type, only the new code (or library) will be "aware" of this new type and able to properly use it.
On the other hand, I'm not sure why you're insisting on the need to use static types and a compiler-linker-loader level solution.
The language itself (the C language) can manage this task dynamically (during execution time).
Consider Ruby MRI, for example. The Ruby language supports dynamic types that can be defined during runtime...
...However, this is implemented in C and it's possible to use the code from within C to define new modules and classes. These aren't static types that can be tested during compilation (type creation and identification is performed during runtime).
This is a perfect example showing that C (as a language) can dynamically define "types".
However, this is also a poor example because Ruby's approach is slow. A custom approach can be far faster, since it would avoid the huge overhead related to functionality you might not need (such as inheritance).
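To make the "handle it dynamically in the language" idea a bit more concrete, here is a minimal sketch in C++ of a custom approach: the text from file1.txt is parsed into a runtime descriptor, and "instances" are simply maps from field name to value. The names RuntimeType, Instance, parseStruct and makeInstance are made up for this illustration, and the parser only assumes simple "type name;" members (no nesting, arrays or comments).

```cpp
#include <cctype>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical runtime description of a struct: its name plus (type, field) pairs.
struct RuntimeType {
    std::string name;
    std::vector<std::pair<std::string, std::string>> fields;
};

// An "instance" is just field name -> value (kept as text for simplicity).
using Instance = std::map<std::string, std::string>;

// Parse text such as "struct Plant { char *name; int lifeSpan; };".
// Assumption: only simple "type name;" members are present.
RuntimeType parseStruct(const std::string& text) {
    RuntimeType t;
    const std::string ws = " \t\r\n";
    std::size_t open = text.find('{');
    std::size_t close = text.rfind('}');
    if (open == std::string::npos || close == std::string::npos || close < open) return t;

    // Header: the word after "struct" is the type name.
    std::istringstream header(text.substr(0, open));
    std::string keyword;
    header >> keyword >> t.name;

    // Body: members are separated by ';'.
    std::istringstream body(text.substr(open + 1, close - open - 1));
    std::string member;
    while (std::getline(body, member, ';')) {
        std::size_t end = member.find_last_not_of(ws);
        if (end == std::string::npos) continue;  // blank chunk
        // The field name is the trailing identifier; the rest is the type.
        std::size_t start = end + 1;
        while (start > 0 &&
               (std::isalnum(static_cast<unsigned char>(member[start - 1])) ||
                member[start - 1] == '_')) {
            --start;
        }
        std::string type = member.substr(0, start);
        std::size_t tb = type.find_first_not_of(ws);
        std::size_t te = type.find_last_not_of(ws);
        if (tb == std::string::npos) continue;  // no type part, skip
        t.fields.emplace_back(type.substr(tb, te - tb + 1),
                              member.substr(start, end - start + 1));
    }
    return t;
}

// Create an instance with every declared field present and empty.
Instance makeInstance(const RuntimeType& t) {
    Instance obj;
    for (const auto& f : t.fields) obj[f.second] = "";
    return obj;
}
```

The already-compiled code never sees a static Plant type; it only sees the descriptor, so all field access goes through names looked up at run time. That is the trade-off this approach shares with Ruby's, just with much less machinery.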

How to modify C++ code from user-input

I am currently writing a program that sits on top of a C++ interpreter. The user inputs C++ commands at runtime, which are then passed into the interpreter. For certain patterns, I want to replace the command given with a modified form, so that I can provide additional functionality.
I want to replace anything of the form
A->Draw(B1, B2)
with
MyFunc(A, B1, B2).
My first thought was regular expressions, but that would be rather error-prone, as any of A, B1, or B2 could be arbitrary C++ expressions. As these expressions could themselves contain quoted strings or parentheses, it would be quite difficult to match all cases with a regular expression. In addition, there may be multiple, nested forms of this expression.
My next thought was to call clang as a subprocess, use "-Xclang -ast-dump" to get the abstract syntax tree, modify that, then rebuild it into a command to be passed to the C++ interpreter. However, this would require keeping track of any environment changes, such as include files and forward declarations, in order to give clang enough information to parse the expression. As the interpreter does not expose this information, this seems infeasible as well.
The third thought was to use the C++ interpreter's own internal parsing to convert to an abstract syntax tree, then build from there. However, this interpreter does not expose the ast in any way that I was able to find.
Are there any suggestions as to how to proceed, either along one of the stated routes, or along a different route entirely?

What you want is a Program Transformation System.
These are tools that generally let you express changes to source code, written in source level patterns that essentially say:
if you see *this*, replace it by *that*
but operating on Abstract Syntax Trees so the matching and replacement process is
far more trustworthy than what you get with string hacking.
Such tools have to have parsers for the source language of interest.
The source language being C++ makes this fairly difficult.
Clang sort of qualifies; after all, it can parse C++. OP objects that it cannot do so without all the environment context. To the extent that OP is typing (well-formed) program fragments (statements, etc.) into the interpreter, Clang may [I don't have much experience with it myself] have trouble working out what kind of fragment it is looking at (statement? expression? declaration? ...). Finally, Clang isn't really a PTS; its tree modification procedures are not source-to-source transforms. That matters for convenience but might not stop OP from using it; surface-syntax rewrite rules are convenient, but you can always substitute procedural tree hacking with more effort. When there are more than a few rules, this starts to matter a lot.
GCC with Melt sort of qualifies in the same way that Clang does.
I'm under the impression that Melt makes GCC at best a bit less
intolerable for this kind of work. YMMV.
Our DMS Software Reengineering Toolkit with its full C++14 [EDIT July 2018: C++17] front end absolutely qualifies. DMS has been used to carry out massive transformations
on large scale C++ code bases.
DMS can parse arbitrary (well-formed) fragments of C++ without being told in advance what the syntax category is, and return an AST of the proper grammar nonterminal type, using its pattern-parsing machinery. [You may end up with multiple parses, e.g. ambiguities, that you'll have to decide how to resolve; see Why can't C++ be parsed with a LR(1) parser? for more discussion.] It can do this without resorting to "the environment" if you are willing to live without macro expansion while parsing, and insist the preprocessor directives (they get parsed too) are nicely structured with respect to the code fragment (#if foo{#endif not allowed), but that's unlikely to be a real problem for interactively entered code fragments.
DMS then offers a complete procedural AST library for manipulating the parsed trees (search, inspect, modify, build, replace) and can then regenerate surface source code from the modified tree, giving OP text
to feed to the interpreter.
Where it shines in this case is OP can likely write most of his modifications directly as source-to-source syntax rules. For his
example, he can provide DMS with a rewrite rule (untested but pretty close to right):
rule replace_Draw(A:primary,B1:expression,B2:expression):
primary->primary
"\A->Draw(\B1, \B2)" -- pattern
rewrites to
"MyFunc(\A, \B1, \B2)"; -- replacement
and DMS will take any parsed AST containing the left hand side "...Draw..." pattern and replace that subtree with the right hand side, after substituting the matches for A, B1 and B2. The quote marks are metaquotes and are used to distinguish C++ text from rule-syntax text; the backslash is a metaescape used inside metaquotes to name metavariables. For more details of what you can say in the rule syntax, see DMS Rewrite Rules.
If OP provides a set of such rules, DMS can be asked to apply the entire set.
So I think this would work just fine for OP. It is a rather heavyweight mechanism to "add" to the package he wants to provide to a 3rd party; DMS and its C++ front end are hardly "small" programs. But then modern machines have lots of resources, so I think it's a question of how badly OP needs to do this.

Try modifying the headers to suppress the method; when you then compile, the errors will show you every call site that needs to be replaced.
Since you have a C++ interpreter (such as CERN's ROOT), I suggest using the compiler to intercept all calls to Draw. An easy and clean way to do that is to declare the Draw method as private in the headers, using some defines:
class ItemWithDrawMethod
{
    ....
public:
#ifdef CATCHTHEMETHOD
private:
#endif
    void Draw(A,B);
#ifdef CATCHTHEMETHOD
public:
#endif
    ....
};
Then compile as:
g++ -DCATCHTHEMETHOD=1 yourfilein.cpp

If the user needs to input complex algorithms to the application, I suggest integrating a scripting language into the app, so that the user can write code (a function or algorithm in a defined form) and the app can execute it in the interpreter and get the final results. Examples: Python, Perl, JS, etc.
Since you need C++ in the interpreter, http://chaiscript.com/ would be a suggestion.

What happens when someone gets ahold of the Draw member function (auto draw = &A::Draw;) and then starts using draw? Presumably you'd want the same improved Draw-functionality to be called in this case too. Thus I think we can conclude that what you really want is to replace the Draw member function with a function of your own.
Since it seems you are not in a position to modify the class containing Draw directly, a solution could be to derive your own class from A and override Draw in there. Then your problem reduces to having your users use your new improved class.
You may again consider the problem of automatically translating uses of class A to your new derived class, but this still seems pretty difficult without the help of a full C++ implementation. Perhaps there is a way to hide the old definition of A and present your replacement under that name instead, via clever use of header files, but I cannot determine whether that's the case from what you've told us.
Another possibility might be to use some dynamic linker hackery using LD_PRELOAD to replace the function Draw that gets called at runtime.

There may be a way to accomplish this mostly with regular expressions.
Since anything that appears after Draw( is already formatted correctly as parameters, you don't need to fully parse them for the purpose you have outlined.
Fundamentally, the part that matters is the "SYMBOL->Draw("
SYMBOL could be any expression that resolves to an object that overloads -> or to a pointer of a type that implements Draw(...). If you reduce this to two cases, you can short-cut the parsing.
For the first case, use a simple regular expression that searches for any valid C++ symbol, something similar to "[A-Za-z_][A-Za-z0-9_\.]*", along with the literal expression "->Draw(". This will give you the portion that must be rewritten, since the code following this part is already formatted as valid C++ parameters.
The second case is for complex expressions that return an overloaded object or pointer. This requires a bit more effort, but a short parsing routine to walk backward through just a complex expression can be written surprisingly easily, since you don't have to support blocks (blocks in C++ cannot return objects, since lambda definitions do not call the lambda themselves, and actual nested code blocks {...} can't return anything directly inline that would apply here). Note that if the expression doesn't end in ) then it has to be a valid symbol in this context, so if you find a ) just match nested ) with ( and extract the symbol preceding the nested SYMBOL(...(...)...)->Draw() pattern. This may be possible with regular expressions, but should be fairly easy in normal code as well.
As soon as you have the symbol or expression, the replacement is trivial, going from
SYMBOL->Draw(...
to
YourFunction(SYMBOL, ...
without having to deal with the additional parameters to Draw().
As an added benefit, chained function calls are parsed for free with this model, since you can recursively iterate over the code such as
A->Draw(B...)->Draw(C...)
The first iteration identifies the first A->Draw( and rewrites the whole statement as
YourFunction(A, B...)->Draw(C...)
which then identifies the second ->Draw with an expression "YourFunction(A, ...)->" preceding it, and rewrites it as
YourFunction(YourFunction(A, B...), C...)
where B... and C... are well-formed C++ parameters, including nested calls.
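For what it's worth, here is a rough C++ sketch of the backward walk described above. It is only an illustration under stated assumptions: the function names (isSymbolChar, receiverStart, rewriteDraw) are made up, the default replacement name MyFunc is taken from the question, "->" is also treated as part of the receiver so that a->b->Draw(x) keeps a->b together, and parentheses inside string or character literals are not handled.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Characters that may appear in a plain symbol (identifier characters plus '.').
static bool isSymbolChar(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_' || c == '.';
}

// Return the index where the receiver expression ending just before `end` begins.
// Walks backward over symbol characters, "->" and balanced (...) groups.
static std::size_t receiverStart(const std::string& s, std::size_t end) {
    std::size_t i = end;
    while (i > 0) {
        char c = s[i - 1];
        if (isSymbolChar(c)) {
            --i;
        } else if (c == '>' && i >= 2 && s[i - 2] == '-') {
            i -= 2;  // step over a "->" inside the receiver
        } else if (c == ')') {
            int depth = 0;
            while (i > 0) {  // skip back to the matching '('
                char d = s[--i];
                if (d == ')') ++depth;
                else if (d == '(' && --depth == 0) break;
            }
        } else {
            break;  // anything else ends the receiver
        }
    }
    return i;
}

// Rewrite every RECEIVER->Draw(args) into replacement(RECEIVER, args).
// Assumption: `replacement` itself does not contain "->Draw(".
std::string rewriteDraw(std::string code, const std::string& replacement = "MyFunc") {
    const std::string needle = "->Draw(";
    std::size_t pos;
    while ((pos = code.find(needle)) != std::string::npos) {
        std::size_t start = receiverStart(code, pos);
        assert(start <= pos);
        std::string receiver = code.substr(start, pos - start);
        std::size_t after = pos + needle.size();
        // For Draw() with no arguments, do not emit a dangling comma.
        std::size_t next = code.find_first_not_of(" \t", after);
        bool noArgs = next != std::string::npos && code[next] == ')';
        code.replace(start, after - start,
                     replacement + "(" + receiver + (noArgs ? "" : ", "));
    }
    return code;
}
```

The arguments after Draw( are never inspected (apart from checking for an empty list), which is the whole point of the approach: only the receiver has to be isolated. Because the loop always takes the first remaining ->Draw(, chained calls get nested from the inside out as described.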
Without knowing the C++ version that your interpreter supports, or the kind of code you will be rewriting, I really can't provide any sample code that is likely to be worthwhile.

One way is to load user code as a DLL (something like plugins).
This way, you don't need to recompile your actual application; only the user code is compiled, and your application loads it dynamically.

How to implement Python function exec() in C++?

The Python function exec() is a very good tool for programming.
For example:
/******************************************************
* we have array names={"ali","hamed"}
* goal is declare string ali="ali" and hamed="hamed"
******************************************************/
Python code:
# Lines starting with '#' are comments.
# Python variables are not declared before use.
names = ["ali", "hamed"]
# Python uses indentation instead of { } to delimit blocks.
for i in names:
    temp = i + ' = "' + i + '"'  # temp is a string such as: ali = "ali"
    exec(temp)
My question: Does a similar function exist in C++? If not, how can it be implemented in C++?

Python's exec, and similar features in other languages (e.g. eval in JavaScript), only work because those are interpreted languages where variables are defined at run time.
You can't do what you're asking directly in C++ because it's a compiled language that requires that all variables and their names are known at compile time.

You can do it, but it's not trivial, and it will only work on
systems where a C++ compiler is installed. Basically, C++ is
compiled, so if you want to execute some string which is C++,
you'll have to compile it: write it to a file (probably with
additional boilerplate like some includes), invoke the compiler
(with system) to build a DLL, and then load the DLL.
In practice, even if you went to all that effort, you'd probably
find it less useful than you think. Because C++ requires static
declarations for just about everything, it's much harder to
write small snippets like that and have them usable. (You
would, for example, have to wrap it in a function, and generate
extern for all of the variables it uses.) C++ doesn't support
this sort of thing for software engineering reasons; the very
features which make languages like Python so flexible for small,
quickly written programs, cause severe maintenance problems when
used in large scale robust software. Different languages are
designed for different purposes. It's very easy to create
a mixed language system using Python for the top level glue
(where you would want a facility such as you describe), and
calling into C++ for the heavy work (where such a facility would
allow the user to trigger core dumps at will).
Alternatively, you can implement a small language in C++. For
keeping user defined variables, for example, use std::map,
rather than declaring the variables. (That is, after all, what
Python does under the hood.)
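As a minimal sketch of that last suggestion (not a real interpreter), the following C++ keeps "variables" in a std::map and executes statements of the single form name = "value", which is all the Python example in the question needs. The names Variables and execAssign are hypothetical.

```cpp
#include <map>
#include <string>

// Variable store: names are data looked up at run time, not compile-time identifiers.
using Variables = std::map<std::string, std::string>;

// Execute one statement of the form  name = "value"  against the store.
// Returns false if the statement does not have that shape.
bool execAssign(Variables& vars, const std::string& stmt) {
    std::size_t eq = stmt.find('=');
    if (eq == std::string::npos || eq == 0) return false;

    // Variable name: everything before '=', trimmed.
    std::size_t nb = stmt.find_first_not_of(" \t");
    if (nb == std::string::npos || nb >= eq) return false;
    std::size_t ne = stmt.find_last_not_of(" \t", eq - 1);
    std::string name = stmt.substr(nb, ne - nb + 1);

    // Value: the text between the first and last double quote after '='.
    std::size_t first = stmt.find('"', eq + 1);
    std::size_t last = stmt.rfind('"');
    if (first == std::string::npos || last <= first) return false;

    vars[name] = stmt.substr(first + 1, last - first - 1);
    return true;
}
```

Mirroring the question: loop over {"ali", "hamed"}, build the string name + " = \"" + name + "\"", and pass it to execAssign; afterwards vars["ali"] holds "ali". The variables live in the map rather than in the C++ scope, which is the price of doing this in a compiled language.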

If you are planning to use strings as parameters, it is NOT a good idea!
You can simply use an ID generator function and give an ID to every string that you are going to use, or, if you are using a class, you can put that function in the constructor.
There is no need for exec() at all!

CoffeeScript-like language for C/C++

I've been looking for a language that provides the same functionality that Coffeescript has, but for C/C++. I mean a language that converts the code into C, but readable, just like CoffeeScript converts to Javascript, readable and all.

I think this is possible, and even desirable (I grudgingly deal with C++ when writing Node.js native modules), but more challenging than with a higher-level language like JavaScript.
What you're asking for is a language that would provide syntactic sugar without sacrificing performance or flexibility. Some syntactic sugars (say, syntactic whitespace or Ruby-style def/end blocks instead of curly braces) would be trivial to add. But adding anything more advanced, you'd run into two major hurdles: static typing, and garbage collection.
For instance, let's say that you wanted to add implicit returns. It seems like a small feature, but think about it: In order for the feature to be useful, you'd have to—at the very least—throw a compile-time error when the value of the last expression doesn't match the function's return type. That means that your compiler needs to inspect a line like
a->b
and figure out what type it is. That's possible in principle, but it's a heck of a lot more work than the CoffeeScript compiler does.
Or say you added list comprehensions. That means you're allocating an array whose length isn't known at compile-time, which means you'll need to later deallocate it yourself. So the syntactic sugar could actually hurt you. The rule "If you malloc it, you free it" doesn't work if the compiler is adding in the malloc for you, unless it can figure out where to put the free (which, again, is generally possible but would take a lot of work).
So, while I'd love to see someone give C++ the CoffeeScript treatment, I don't expect it to happen any time soon, if ever. I think it's more likely that the world will eventually move on to something like D or Go for system-level programming.

I think OOC is probably the closest thing to Coffeescript for C. It's a programming language with a lot of the features you'd expect from dynamic languages (objects, first class functions, clean syntax) that compiles directly into C99.
http://ooc-lang.org/

One item missing from Jacinda's list that you might want to know about: Vala/Genie is a compiler targeting C with the GObject library implementing objects, written by Gnome. Vala has a C#-like syntax, and Genie a Python-like syntax, but otherwise they are the same system. It was actually created because bare C + GObject became too much of a pain to work with for the Gnome guys. Vala does objects and automatic memory management based on reference counting or ownership tracking, and a lot of other things you'd expect in a C#-like language.
As for the CoffeeScript-like property, I just saw that there was an experimental feature to disable the dependency of the generated code on GObject, so it generates just plain C without any runtime dependencies. Doing so disables a number of more advanced OO features, but it still leaves you with a better syntax, a basic object system, and (semi-)automatic memory management.
I don't know how readable the output is, but if you run it through a pretty printer it might be very close to what you're looking for.

SugarCpp is a language which can compile to C++11. It should be what you are looking for. Visit https://github.com/curimit/SugarCpp for more details.

For Python specifically, take a look at this question:
Convert Python program to C/C++ code?
They mention Shed Skin, which will take a subset of pure python and convert to standalone C++ code.
Cython is typically used to create Python extension modules, but can create standalone programs if the Python interpreter is embedded. This doesn't sound like what you're looking for, though.
Cython is based on Pyrex, and they are compatible with each other in many ways.
For some of the other languages you mentioned there seem to be similar projects: Ruby and PHP. Toba for Java (though no longer maintained), Marst for Algol, BCX for BASIC, COB2C, PtoC for Pascal and I should probably stop there before this turns into "List of Converters from Foo to C/C++."
Hope that helps!

Take a look at this fresh new project: https://bixense.com/coffeepp/
Coffee++
Coffee++ is a little language that compiles into C++. It has been created to have something similar to CoffeeScript for C++. Currently Coffee++ is in an alpha state and not at all usable or final. Check out the source on Github to get involved.
The golden rule of Coffee++ is: "It's just C++". The code compiles one-to-one into the equivalent C++, and there is no runtime library. You can use any existing C++ library seamlessly from Coffee++ (and vice-versa).
Overview:
source file Test.cf++
include iostream
int main():
    age := 5
    dog := Dog(age)
    if age != 7:
        dog.bark()
class Dog:
    public Dog(int age):
        this->age := age
    public void bark():
        std::cout << "Woof!\n"
    private int age
compiled Test.hpp
#pragma once
int main();
class Dog {
public:
    Dog(int age);
    void bark();
private:
    int age;
};
compiled Test.cpp
#include "Test.hpp"
#include <iostream>
int main() {
    auto age = 5;
    auto dog = Dog(age);
    if (age != 7) {
        dog.bark();
    }
}
Dog::Dog(int age) : age(age) {
}
void Dog::bark() {
    std::cout << "Woof!\n";
}

Since vala and genie were already mentioned, I'll put BaCon (Basic Converter) out there for those who reminisce about hand coding programs from a monthly print publication, but want to use it with a modern GUI.
Must run on each Unix/Linux/BSD platform, including MacOSX
Converted sourcecode must be compilable with GCC
Must resemble genuine BASIC with implicit variable declarations
Spoken language constructs are preferred
The website http://www.basic-converter.org/ has lots of examples (some of them pretty complex for "BASIC") and plugins for nearly every open-source IDE, or you can use the BaCon IDE.

Well, this is not what you want, but: http://www.campbell.nu/oscar/cython/index.html - This cython/cytoc is a significant-whitespace (Pythonish) transpiler for C/C++ that I coded around 1999/2000; it has no relation to the Cython project that arrived seven years later.
Frankly, I wrote it in Perl and it's heuristic, using regular expressions. I used it for an entire project of a Gameboy Color game (regular ANSI C). But I wouldn't trust it... Which is why I'm looking around too, instead of using that dusty old bugger ;)
Follow up:
I've been working on Onyx (https://github.com/ozra/onyx-lang) for a year plus now, and finally realized the obvious thing to do is rewrite it to compile to C++ instead of LLVM-IR. The re-target idea is brand fresh, so the rewrite is still vapor. But your input would be put to good use in RFCs; if you like the idea of the language, it's your chance to shape it.

Is there a tool that enables me to insert one line of code into all functions and methods in a C++-source file?

It should turn this
int Yada (int yada)
{
return yada;
}
into this
int Yada (int yada)
{
SOME_HEIDEGGER_QUOTE;
return yada;
}
but for all (or at least a big bunch of) syntactically legal C/C++ - function and method constructs.
Maybe you've heard of some Perl library that will allow me to perform these kinds of operations in a few lines of code.
My goal is to add a tracer to an old, but big C++ project in order to be able to debug it without a debugger.

Try Aspect C++ (www.aspectc.org). You can define an Aspect that will pick up every method execution.
In fact, the quickstart has pretty much exactly what you are after defined as an example:
http://www.aspectc.org/fileadmin/documentation/ac-quickref.pdf

If you build using GCC and the -pg flag, GCC will automatically issue a call to the mcount() function at the start of every function. In this function you can then inspect the return address to figure out where you were called from. This approach is used by the linux kernel function tracer (CONFIG_FUNCTION_TRACER). Note that this function should be written in assembler, and be careful to preserve all registers!
Also, note that this should be passed only in the build phase, not link, or GCC will add in the profiling libraries that normally implement mcount.

I would suggest using the gcc flag "-finstrument-functions". Basically, it automatically calls a specific function ("__cyg_profile_func_enter") upon entry to each function, and another function is called ("__cyg_profile_func_exit") upon exit of the function. Each function is passed a pointer to the function being entered/exited, and the function which called that one.
You can turn instrumenting off on a per-function or per-file basis... see the docs for details.
The feature goes back at least as far as version 3.0.4 (from February 2002).
This is intended to support profiling, but it does not appear to have side effects like -pg does (which compiles code suitable for profiling).
This could work quite well for your problem (tracing execution of a large program), but, unfortunately, it isn't as general purpose as it would have been if you could specify a macro. On the plus side, you don't need to worry about remembering to add your new code into the beginning of all new functions that are written.
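A minimal sketch of what the two hooks could look like for tracing, assuming GCC or Clang (for the attribute syntax). The hook names and signatures are the ones GCC documents; the g_depth counter and the output format are just choices made for this illustration.

```cpp
#include <cstdio>

// Nesting depth of instrumented calls currently active (illustrative only,
// not thread-safe).
static int g_depth = 0;

extern "C" {

// Called on entry to every function compiled with -finstrument-functions.
// The attribute keeps the hook itself from being instrumented, which would
// otherwise recurse forever.
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void* this_fn, void* call_site) {
    std::fprintf(stderr, "%*senter %p (called from %p)\n",
                 2 * g_depth, "", this_fn, call_site);
    ++g_depth;
}

// Called on exit from every instrumented function.
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void* this_fn, void* call_site) {
    --g_depth;
    std::fprintf(stderr, "%*sexit  %p (called from %p)\n",
                 2 * g_depth, "", this_fn, call_site);
}

}  // extern "C"
```

Put this in its own source file and build the whole project with -finstrument-functions. The hooks only receive addresses, so turning them into function names is a separate step (for example with a symbolizer such as addr2line on a binary built with debug info).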

There is no such tool that I am aware of. In order to recognise the correct insertion point, the tool would have to include a complete C++ parser - regular expressions are not enough to accomplish this.
But as there are a number of FOSS C++ parsers out there, such a tool could certainly be written - a sort of intelligent sed for C++ code. The biggest problem would probably be designing the specification language for the insert/update/delete operation - regexes are obviously not the answer, though they should certainly be included in the language somehow.
People are always asking here for ideas for projects - how about this for one?

I use this regex,
"(?<=[\\s:~])(\\w+)\\s*\\(([\\w\\s,<>\\[\\].=&':/*]*?)\\)\\s*(const)?\\s*{"
to locate the functions and add extra lines of code.
With that regex I also get the function name (group 1) and the arguments (group 2).
Note: you must filter out names like, "while", "do", "for", "switch".
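Here is a small C++ sketch of the same idea using std::regex. Note that it does not use the exact pattern above: std::regex's default ECMAScript grammar has no lookbehind, so this uses a simplified pattern (an assumption for this example) that captures the name in group 1 and the argument list in group 2. The function name insertTrace and the keyword list are made up for the illustration, and like any regex approach it will miss or mis-match some legal C++ (constructor initializer lists, macros, function-pointer parameters, and so on).

```cpp
#include <regex>
#include <set>
#include <string>

// Insert `line` right after the opening brace of everything that looks like a
// function definition. Heuristic only.
std::string insertTrace(const std::string& src, const std::string& line) {
    // name ( args ) [const] {   -- group 1 = name, group 2 = args
    static const std::regex fn(R"((\w+)\s*\(([^()]*)\)\s*(const)?\s*\{)");
    // Control-flow keywords that have the same "name (...) {" shape.
    static const std::set<std::string> keywords = {
        "if", "while", "do", "for", "switch", "catch"};

    std::string out;
    std::size_t last = 0;
    for (std::sregex_iterator it(src.begin(), src.end(), fn), end; it != end; ++it) {
        const std::smatch& m = *it;
        if (keywords.count(m[1].str())) continue;  // not a function definition
        // Position just past the '{' that ends the match.
        std::size_t brace = static_cast<std::size_t>(m.position(0) + m.length(0));
        out.append(src, last, brace - last);
        out += "\n" + line;
        last = brace;
    }
    out.append(src, last, std::string::npos);
    return out;
}
```

Calling insertTrace on the Yada function from the question with the line SOME_HEIDEGGER_QUOTE; places that line directly after the opening brace, while a block such as while (x) { ... } is left untouched because of the keyword filter.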

This can be easily done with a program transformation system.
The DMS Software Reengineering Toolkit is a general purpose program transformation system, and can be used with many languages (C#, COBOL, Java, EcmaScript, Fortran, ..) as well as specifically with C++.
DMS parses source code (using a full language front end, in this case for C++),
builds Abstract Syntax Trees, and allows you to apply source-to-source patterns to transform your code from one C++ program into another with whatever properties you wish. The transformation rule to accomplish exactly the task you specified would be:
domain CSharp.
insert_trace():function->function
"\visibility \returntype \fnname(int \parametername)
{ \body } "
->
"\visibility \returntype \fnname(int \parametername)
{ Heidigger(\CppString\(\fnname\),
\CppString\(\parametername\),
\parametername);
\body } "
The quote marks (") are not C++ quote marks; rather, they are "domain quotes", and indicate that the content inside the quote marks is C++ syntax (because we said, "domain CSharp"). The \foo notations are meta syntax.
This rule matches the AST representing the function, and rewrites that AST into the traced form. The resulting AST is then prettyprinted back into source form, which you can compile. You probably need other rules to handle other combinations of arguments; in fact, you'd probably generalize the argument processing to produce (where practical) a string value for each scalar argument.
It should be clear you can do a lot more than just logging with this, and a lot more than just aspect-oriented programming, since you can express arbitrary transformations and not just before-after actions.
