Get class annotation using gcc plugins - c++

I am creating a GCC plugin that analyzes a C++ file after parsing it.
The plugin walks through the classes and generates some information about them.
The plugin is working; this is how I walk through the classes:
cp_binding_level* level(NAMESPACE_LEVEL(nameSpace));
for (decl = level->names; decl != 0; decl = TREE_CHAIN(decl)) {
tree type(TREE_TYPE(decl));
tree_code dc(TREE_CODE(decl));
tree_code tc(TREE_CODE(type)); // initialize tc from the declaration's type
if (dc == TYPE_DECL && tc == RECORD_TYPE &&
!DECL_IS_BUILTIN (decl) && DECL_ARTIFICIAL (decl)) {
//Now we know this is a class
//Do something
}
}
I would like to choose which classes the plugin analyzes and which ones it skips.
My first idea is to add some sort of annotation that I would read when I visit the class, and use it to decide whether to analyze the class.
I have never used any sort of annotation in C++, so I don't know if this is possible. If it is, how would you recommend using annotations, and how do I read them inside the plugin?
If it is not, is there another good way to do what I need?

It can be done, it is not too hard, and it is a pretty common thing to do using a GCC plugin.
First you must register a new attribute. GCC provides the PLUGIN_ATTRIBUTES callback as a convenient time to do so. Your callback function can then call register_attribute to register attributes. This is documented in the manual, just one link away from the spot you linked to.
With this function you register another callback that is called when your attribute is applied. You'll have to read some GCC header files or source to really understand what this function should do. But, it can easily track whether it is being applied to a class and, if so, make a note of this for later processing.

Related

Adding Custom c++ function in chromium and call them in browser

I am trying to write a custom function in bootstrapper.cc under v8/src/init:
int helloworld(){
return 0;
}
When I try to call it from the Chromium console, it comes back as undefined.
Look around bootstrapper.cc to see how other built-in functions are installed. Examples you could look at include Array and DataView (or any other, really).
There is no way to simply define a C++ function of a given name and have that show up in JavaScript. Instead, you have to define a property on the global object; and the function itself needs to have the right calling convention, and process its parameters / prepare its return value appropriately so that it can be called from JavaScript. You can't just take or return an int.
If you find it inconvenient to work with C++, an alternative might be to develop a Chrome extension, which would allow you to use JavaScript for the implementation, and also remove the need to compile/maintain/update your own build (which is a lot of work!). There is no existing guide for how to extend V8 in the way you're asking, because that approach is so much work that we don't recommend doing it like this (though of course it is possible -- you just have to read enough of the existing C++ source to understand how it's done).

Use gcc plugins to modify the order of variable declarations

I know this is very hard to do and that I should avoid it, but I have my reasons.
I want to modify the order of some field declarations at compile time. For example:
class A {
char c;
int i;
};
must turn into:
class A {
int i;
char c;
};
if I choose to swap the order of i and c.
I want to know how to change the position of a field declaration, given its tree.
Does anyone know how I can do this?
Thanks!
I am using the plugin API of g++ 4.9.2.
If I was going to try this, I would try two different approaches.
Hook in to the PLUGIN_FINISH_TYPE event and rewrite the type there. To rewrite it, reorder the fields and force a relayout of the type. You'll have to read a bit of GCC source to understand how to invalidate the layout and force a new one.
If that didn't work, add a new pass that is run just after gimplification, and try to rewrite the types there. I suspect this is not likely to work, though.
The first approach suggested above (hook in to the PLUGIN_FINISH_TYPE event, reorder the fields and force a relayout of the type) is implemented in randomize_layout_plugin.c in the Linux kernel.
This solution works, but it breaks the DWARF debug information: in the debug information the members keep the order in which they were defined in the source code, while the structure in the binary is shuffled.

Parsing C++ to make some changes in the code

I would like to write a small tool that takes a C++ program (a single .cpp file), finds the "main" function and adds 2 function calls to it, one in the beginning and one in the end.
How can this be done? Can I use g++'s parsing mechanism (or any other parser)?
If you want to make it solid, use clang's libraries.
As suggested by some commenters, let me put forward my idea as an answer:
So basically, the idea is:
... original .cpp file ...
#include <yourHeader>
namespace {
SpecialClass specialClassInstance;
}
Where SpecialClass is something like:
class SpecialClass {
public:
SpecialClass() {
firstFunction();
}
~SpecialClass() {
secondFunction();
}
};
This way, you don't need to parse the C++ file. Since you are declaring a global, its constructor will run before main starts and its destructor will run after main returns.
The downside is that you don't get to know the relative order of when your global is constructed compared to others. So if you need to guarantee that firstFunction is called before any other constructor elsewhere in the entire program, you're out of luck.
I've heard the GCC parser is both hard to use and even harder to get at without invoking the whole toolchain. I would try the clang C/C++ parser (libparse), and the tutorials linked in this question.
Adding a function call at the beginning of main() and another at the end of main() is a bad idea. What if someone returns in the middle?
A better idea is to instantiate a class at the beginning of main() and let that class's destructor call the function you want called at the end. This ensures that the function always gets called, whichever return statement is taken.
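A minimal sketch of that idea follows. The names MainGuard, firstFunction, secondFunction and run are hypothetical (run stands in for the body of main()), and the event log exists only to make the call order observable.

```cpp
#include <string>
#include <vector>

// Hypothetical event log, used only to make the call order visible.
std::vector<std::string>& events() {
    static std::vector<std::string> log;
    return log;
}

void firstFunction()  { events().push_back("first"); }
void secondFunction() { events().push_back("second"); }

// Calls firstFunction() on construction and secondFunction() on destruction.
class MainGuard {
public:
    MainGuard() { firstFunction(); }
    ~MainGuard() { secondFunction(); }
    MainGuard(const MainGuard&) = delete;
    MainGuard& operator=(const MainGuard&) = delete;
};

// Stand-in for the body of main(): the guard is the first statement.
int run(bool leaveEarly) {
    MainGuard guard;
    if (leaveEarly) {
        events().push_back("early return");
        return 1;  // the guard's destructor still calls secondFunction()
    }
    events().push_back("normal end");
    return 0;
}
```

The destructor runs on every return path and when an exception unwinds the scope, but not if the program leaves through std::exit or std::abort.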
If you have control of your main program, you can hack a script to do this, and that's by far the easiest way. Simply make sure the insertion points are obvious (odd comments, required placement of tokens, you choose) and unique (including outlawing general coding practices if you have to, to ensure the uniqueness you need is real). Then a dumb string hacking tool that reads the source, finds the unique markers, and inserts your desired calls will work fine.
If the source of the main program comes from other sources, and you don't have control, then to do this well you need a full C++ program transformation engine. You don't want to build this yourself, as just the C++ parser is an enormous effort to get right. Others here have mentioned Clang and GCC as answers.
An alternative is our DMS Software Reengineering Toolkit with its C++ front end. DMS, using its C++ front end, can parse code (for a variety of C++ dialects), build ASTs, and carry out full name/type resolution to determine the meaning/definition/use of all symbols. It provides procedural and source-to-source transformations to enable changes to the AST, and can regenerate compilable source code complete with original comments.

Executing certain code for every method call in C++

I have a C++ class I want to inspect. I would like all of its methods to print their parameters and their return value just before returning.
The latter looks somewhat easy. If I do return() for everything, a macro
#define return(a) cout << (a) << endl; return (a)
would do it (might be wrong) if I standardize all returns to the parenthesized form (or whatever this may be called). If I want to take this out, I just comment out the define.
However, printing inputs seems more difficult. Is there a way I can do it, using C++ structures or with a workaround hack?
A few options come to mind:
Use a debugger.
Use the decorator pattern, as Space_C0wb0y suggested. However, this could be a lot of manual typing, since you'd have to duplicate all of the methods in the decorated class and add logging yourself. Maybe you could automate the creation of the decorator object by running doxygen on your class and then parsing its output...
Use aspect-oriented programming. (Logging, which is what you're wanting to do, is a common application of AOP.) Wikipedia lists a few AOP implementations for C++: AspectC++, XWeaver, and FeatureC++.
"However, printing inputs seems more difficult. Is there a way I can do it, using C++ structures or with a workaround hack?"
No.
Update: I'm going to lose some terseness in my answer by suggesting that you can probably achieve what you need by applying Design by Contract.
It sounds to me like you want a debugger. That will let you see all of the parameters you are interested in.
If you don't mind inserting some code by hand, you can create a class that:
logs entry to the method in the constructor
provides a method to dump arbitrary parameters
provides a method to record status
logs exit with recorded status in the destructor
The usage would look something like:
unsigned long long
factorial(unsigned long long n) {
Inspector inspect("factorial", __FILE__, __LINE__);
inspect.parameter("n", n);
if (n < 2) {
return inspect.result(1);
}
return inspect.result(n * factorial(n-1));
}
Of course, you can write macros to declare the inspector and inspect the parameters. If you are working with a compiler that supports variable argument list macros, then you can get the result to look like:
unsigned long long
factorial(unsigned long long n) {
INJECT_INSPECTOR(n);
if (n < 2) {
return INSPECT_RETURN(1);
}
return INSPECT_RETURN(n * factorial(n-1));
}
I'm not sure if you can get a cleaner solution without going to something like an AOP environment or some custom code generation tool.
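The Inspector class itself is not shown above, so here is one possible sketch of it. Everything beyond the three calls used in the example (the constructor, parameter and result) is an assumption: the output format, the optional std::ostream argument, and the extra stream parameter on factorial are added only so the log can be captured.

```cpp
#include <iostream>
#include <sstream>
#include <string>

class Inspector {
public:
    // Logs entry to the method.
    Inspector(const char* function, const char* file, int line,
              std::ostream& out = std::clog)
        : out_(out), function_(function) {
        out_ << "enter " << function_ << " (" << file << ':' << line << ")\n";
    }

    // Dumps an arbitrary parameter; T must be streamable.
    template <typename T>
    void parameter(const char* name, const T& value) {
        out_ << "  " << name << " = " << value << '\n';
    }

    // Records the value about to be returned and passes it through unchanged.
    template <typename T>
    T result(T value) {
        std::ostringstream text;
        text << value;
        status_ = text.str();
        return value;
    }

    // Logs exit together with the recorded status.
    ~Inspector() {
        out_ << "exit " << function_ << " -> " << status_ << '\n';
    }

private:
    std::ostream& out_;
    std::string function_;
    std::string status_ = "(no result recorded)";
};

unsigned long long factorial(unsigned long long n, std::ostream& out) {
    Inspector inspect("factorial", __FILE__, __LINE__, out);
    inspect.parameter("n", n);
    if (n < 2) {
        return inspect.result(1ULL);
    }
    return inspect.result(n * factorial(n - 1, out));
}
```

Because the exit line is written in the destructor, it is also emitted when the method leaves through an exception, in which case the status stays at its default text.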
If your methods are all virtual, you could use the decorator-pattern to achieve that in a very elegant way.
EDIT: From your comment above (you want the output for statistics) I conclude that you should definitely use the decorator-pattern. It is intended for this kind of stuff.
I would just use a logging library (or some macros) and insert manual logging calls. Unless your class has an inordinate number of methods, that is probably faster to get going with than developing and debugging a more sophisticated solution.

How can I create objects based on dump file memory in a WinDbg extension?

I work on a large application, and frequently use WinDbg to diagnose issues based on a DMP file from a customer. I have written a few small extensions for WinDbg that have proved very useful for pulling bits of information out of DMP files. In my extension code I find myself dereferencing c++ class objects in the same way, over and over, by hand. For example:
Address = GetExpression("somemodule!somesymbol");
ReadMemory(Address, &addressOfPtr, sizeof(addressOfPtr), &cb);
// get the actual address
ReadMemory(addressOfPtr, &addressOfObj, sizeof(addressOfObj), &cb);
ULONG offset;
ULONG addressOfField;
GetFieldOffset("somemodule!somesymbolclass", "somefield", &offset);
ReadMemory(addressOfObj+offset, &addressOfField, sizeof(addressOfField), &cb);
That works well, but as I have written more extensions, with greater functionality (and accessing more complicated objects in our application's DMP files), I have longed for a better solution. I have access to the source of our own application of course, so I figure there should be a way to copy an object out of a DMP file and use that memory to create an actual object in the debugger extension that I can call functions on (by linking in DLLs from our application). This would save me the trouble of pulling things out of the DMP by hand.
Is this even possible? I tried obvious things like creating a new object in the extension, then overwriting it with a big ReadMemory directly from the DMP file. This seemed to put the data in the right fields, but crashed when I tried to call a function. I figure I am missing something... maybe C++ does something funky with vtables that I don't know about? My code looks similar to this:
SomeClass* thisClass = SomeClass::New();
ReadMemory(addressOfObj, &(*thisClass), sizeof(*thisClass), &cb);
FOLLOWUP: It looks like POSSIBLY ExtRemoteTyped from EngExtCpp is what I want? Has anyone successfully used this? I need to google up some example code, but am not having much luck.
FOLLOWUP 2: I am pursuing two different routes of investigation on this.
1) I am looking into ExtRemoteTyped, but it appears this class is really just a helper for the ReadMemory/GetFieldOffset calls. Yes, it would help speed things up a lot, but it doesn't really help when it comes to recreating an object from a DMP file. The documentation is slim, though, so I might be misunderstanding something.
2) I am also looking into trying to use ReadMemory to overwrite an object created in my extension with data from the DMP file. However, rather than using sizeof(*thisClass) as above, I was thinking I would only pick out the data elements, and leave the vtables untouched.
Interesting idea, but this would have a hope of working only on the simplest of objects. For example, if the object contains pointers or references to other objects (or vtables), those won't copy very well over to a new address space.
However, you might be able to get a 'proxy' object to work that when you call the proxy methods they make the appropriate calls to ReadMemory() to get the information. This sounds to be a fair bit of work, and I'd think it would have to be more or less a custom set of code for each class you wanted to proxy. There's probably a better way to go about this, but that's what came to me off the top of my head.
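A rough sketch of such a proxy is shown below. To keep it runnable outside a debugger, the debugger's ReadMemory() call is replaced by a hypothetical RemoteReader callback; WidgetProxy, kCountOffset and makeFakeReader are likewise made-up names, and in a real extension the offset would come from GetFieldOffset rather than a constant.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for the debugger's ReadMemory(): copies `size` bytes
// from the target address into `buffer` and returns false on failure.
using RemoteReader =
    std::function<bool(std::uint64_t address, void* buffer, std::size_t size)>;

// Assumed offset of the `count` field inside the remote object.
constexpr std::uint64_t kCountOffset = 8;

// Proxy for an object that lives in the dump: holds only the remote address
// and fetches each field on demand instead of copying the whole object.
class WidgetProxy {
public:
    WidgetProxy(RemoteReader reader, std::uint64_t address)
        : reader_(std::move(reader)), address_(address) {}

    std::uint32_t count() const {
        std::uint32_t value = 0;
        if (!reader_(address_ + kCountOffset, &value, sizeof(value))) {
            throw std::runtime_error("remote read failed");
        }
        return value;
    }

private:
    RemoteReader reader_;
    std::uint64_t address_;
};

// Fake target memory, so the proxy can be tried without a dump file.
RemoteReader makeFakeReader(std::vector<unsigned char> memory, std::uint64_t base) {
    return [memory, base](std::uint64_t address, void* buffer, std::size_t size) {
        if (address < base || address - base + size > memory.size()) {
            return false;
        }
        std::memcpy(buffer, memory.data() + (address - base), size);
        return true;
    };
}
```

Since nothing from the remote object is copied wholesale, remote vtable pointers never end up in the extension's address space. Pointer members would need their own proxy, constructed from the address that was read.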
I ended up just following my initial hunch, and copying over the data from the dmp file into a new object. I made this better by making remote wrapper objects like this:
class SomeClassRemote : public SomeClass
{
protected:
SomeClassRemote (void);
SomeClassRemote (ULONG inRemoteAddress);
public:
static SomeClassRemote * New(ULONG inRemoteAddress);
virtual ~SomeClassRemote (void);
private:
ULONG m_Address;
};
And in the implementation:
SomeClassRemote::SomeClassRemote (ULONG inRemoteAddress)
{
ULONG cb;
m_Address = inRemoteAddress;
// copy in all the data to the new object, skipping the virtual function tables
ReadMemory(inRemoteAddress + 0x4, (PVOID) ((ULONG)&(*this) +0x4), sizeof(SomeClass) - 4, &cb);
}
SomeClassRemote::SomeClassRemote(void)
{
}
SomeClassRemote::~SomeClassRemote(void)
{
}
SomeClassRemote* SomeClassRemote::New(ULONG inRemoteAddress)
{
SomeClassRemote*x = new SomeClassRemote(inRemoteAddress);
return (x);
}
Those are the basics, but then I add specific overrides as necessary to grab more information from the dmp file. This technique allows me to pass these new remote objects back into our original source code for processing in various utility functions, because they are derived from the original class.
It sure SEEMS like I should be able to templatize this somehow... but there always seems to be SOME reason that each class is implemented SLIGHTLY differently, for example some of our more complicated objects have a couple vtables, both of which have to be skipped.
I know memory dumps have always been the way to get diagnostic information, but ETW makes this a lot easier: you get the information together with call stacks that cover both system calls and user code. Microsoft has been doing this for all of its products, including Windows and VS.NET.
It is a non-intrusive way of debugging. I did this kind of dump debugging for a very long time, and with ETW I can now solve most customer issues without spending a lot of time inside the debugger. These are my two cents.
I approached something similar when hacking a GDI leak tracer extension for WinDbg. I used an STL container for data storage in the client and needed a way to traverse the data from the extension. I ended up implementing the parts of the hash_map I needed directly on the extension side using ExtRemoteTyped, which was satisfactory but took me a while to figure out ;o)
Here is the source code.