Use gcc plugins to modify the order of variable declarations - c++

I know this is very hard to do and that I should avoid it, but I have my reasons.
I want to modify the order of some field declarations at compile time. For example:
class A {
char c;
int i;
};
must become:
class A {
int i;
char c;
};
if I choose to swap the order of i and c.
I want to know how to change the position of a field declaration, given its tree.
Does anyone know how I can do this?
Thanks!
I am using the plugin API of g++ 4.9.2.

If I was going to try this, I would try two different approaches.
Hook in to the PLUGIN_FINISH_TYPE event and rewrite the type there. To rewrite it, reorder the fields and force a relayout of the type. You'll have to read a bit of GCC source to understand how to invalidate the layout and force a new one.
If that didn't work, add a new pass that is run just after gimplification, and try to rewrite the types there. I suspect this is not likely to work, though.

Hook in to the PLUGIN_FINISH_TYPE event and rewrite the type there. To rewrite it, reorder the fields and force a relayout of the type. You'll have to read a bit of GCC source to understand how to invalidate the layout and force a new one.
This is implemented in randomize_layout_plugin.c in the Linux kernel.
This solution works, but it breaks the DWARF debug information: the debug information keeps the member order as defined in the source code, while the structure in the binary is shuffled.
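To illustrate why a stale member order in the debug information matters, here is a small standalone sketch (plain C++, not plugin code). The struct names AsWritten and Reordered are hypothetical stand-ins for the class before and after the fields are swapped; the point is only that the same field ends up at a different offset.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in: the layout as written in the source.
struct AsWritten {
    char c;
    int i;
};

// Hypothetical stand-in: the layout after the two fields were swapped.
struct Reordered {
    int i;
    char c;
};

// True when a field sits at a different offset in the two layouts.
bool layouts_differ() {
    return offsetof(AsWritten, i) != offsetof(Reordered, i)
        || offsetof(AsWritten, c) != offsetof(Reordered, c);
}

// The first member of a standard-layout struct is always at offset 0,
// so swapping the fields necessarily moves both of them.
void check_layouts() {
    assert(offsetof(AsWritten, c) == 0);
    assert(offsetof(Reordered, i) == 0);
    assert(layouts_differ());
}
```

A debugger that trusts the source-order description would read i and c from the wrong offsets in the shuffled binary.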

Related

C++ link time resource "allocation" without defines

I'm currently working on a C++ class for an ESP32. I want to implement resource allocation of the resources like: IO-Pins, available RMT channels and so on.
My idea is to do this with some kind of resource handler which checks this at compile time, but I have no good idea nor did I find anything about something like this yet.
To clarify my problem lets have an example of what I mean.
Microcontroller X has IO pins 1-5, each of these can be used by exactly one component.
Components don't know anything about each other and take the pin they should use as a ctor argument.
Now I want to have a class/method/... that checks at compile time whether the pin a component needs is already allocated.
CompA a(5); //works well: 5 is not in use
CompB b(3); //same as before, without the next line it should compile
CompC c(5); //Pin 5 is already in use: does not compile!
I'm not sure yet how to do this. My best guess (as I can't use defines here: users should be able to use it only by giving a parameter or template argument) is that it might work with a template function, but I have not found any way of checking which other parameters have been passed to a template method/class.
Edit1: Parts of the program may be either autogenerated or user defined, in such a way that they do not know about other pin usages. The allocation is thus a "security" feature which should disallow erroneous code. It should also reject the code if the register functions are in different code paths (even if those paths exclude each other).
Edit2: I got a response saying that compile time is wrong here, as components might be compiled separately from one another. So the only way to do this seems to be a linker error.
A silly C-style method: you could desperately use __COUNTER__ as the constructor's argument. This dynamic macro increases itself after each appearance, starting with 0.
I hope there's a better solution.
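A minimal sketch of the __COUNTER__ idea, assuming a hypothetical Component type that takes its pin as a constructor argument. __COUNTER__ is a non-standard extension (available in GCC, Clang and MSVC), and it counts per translation unit.

```cpp
#include <cassert>

// Hypothetical component type: takes the pin as a constructor argument.
struct Component {
    int pin;
    explicit Component(int p) : pin(p) {}
};

// Each appearance of __COUNTER__ expands to the next integer
// within this translation unit.
Component a(__COUNTER__);
Component b(__COUNTER__);
Component c(__COUNTER__);

// True when all three components received different pins.
bool pins_are_distinct() {
    return a.pin != b.pin && b.pin != c.pin && a.pin != c.pin;
}
```

This hands out unique numbers rather than detecting a conflict between pins the user picked, and because the counter restarts in every translation unit it does not help with separately compiled components.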

Get class annotation using gcc plugins

I am creating a gcc plugin that analyses a C++ file after parsing it.
The plugin walks through the classes and generates some information about them.
The plugin is working; this is how I walk through the classes.
cp_binding_level* level(NAMESPACE_LEVEL(nameSpace));
for (decl = level->names; decl != 0; decl = TREE_CHAIN(decl)) {
    tree type(TREE_TYPE(decl));
    tree_code dc(TREE_CODE(decl));
    // Read the type's code only after confirming decl is a TYPE_DECL
    // (the original compared an uninitialized tc against RECORD_TYPE)
    if (dc == TYPE_DECL && TREE_CODE(type) == RECORD_TYPE &&
        !DECL_IS_BUILTIN(decl) && DECL_ARTIFICIAL(decl)) {
        // Now we know this is a class
        // Do something
    }
}
I would like to choose which classes it analyses and which it doesn't.
My first idea is to add some sort of annotation that I would read when I parse the class, and use it to decide whether to analyse the class or not.
I have never used any sort of annotation in C++, so I don't know if this is possible. If it is, how would you recommend I use them, and how do I get the annotation inside the plugin?
If it's not, is there a good way to do what I need?
It can be done, it is not too hard, and it is a pretty common thing to do using a GCC plugin.
First you must register a new attribute. GCC provides the PLUGIN_ATTRIBUTES callback as a convenient time to do so. Your callback function can then call register_attribute to register attributes. This is documented in the manual, just one link away from the spot you linked to.
With this function you register another callback that is called when your attribute is applied. You'll have to read some GCC header files or source to really understand what this function should do. But, it can easily track whether it is being applied to a class and, if so, make a note of this for later processing.

Parsing C++ to make some changes in the code

I would like to write a small tool that takes a C++ program (a single .cpp file), finds the "main" function and adds 2 function calls to it, one in the beginning and one in the end.
How can this be done? Can I use g++'s parsing mechanism (or any other parser)?
If you want to make it solid, use clang's libraries.
As suggested by some commenters, let me put forward my idea as an answer:
So basically, the idea is:
... original .cpp file ...
#include <yourHeader>
namespace {
SpecialClass specialClassInstance;
}
Where SpecialClass is something like:
class SpecialClass {
public:
SpecialClass() {
firstFunction();
}
~SpecialClass() {
secondFunction();
}
};
This way, you don't need to parse the C++ file. Since you are declaring a global, its constructor will run before main starts and its destructor will run after main returns.
The downside is that you don't get to know the relative order of when your global is constructed compared to others. So if you need to guarantee that firstFunction is called before any other constructor elsewhere in the entire program, you're out of luck.
I've heard the GCC parser is both hard to use and even harder to get at without invoking the whole toolchain. I would try the clang C/C++ parser (libparse), and the tutorials linked in this question.
Adding a function call at the beginning of main() and another at the end of main() is a bad idea. What if someone returns in the middle?
A better idea is to instantiate a class at the beginning of main() and let that class's destructor make the call you want at the end. This ensures that the function always gets called.
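A minimal sketch of that idea. The names ScopedCalls, body and call_log are made up for the illustration; body() stands in for main() so the early-return path can be exercised, and firstFunction/secondFunction just record that they ran.

```cpp
#include <cassert>
#include <string>
#include <vector>

std::vector<std::string> call_log;  // records the call order for the demo

void firstFunction()  { call_log.push_back("first"); }
void secondFunction() { call_log.push_back("second"); }

// Calls firstFunction on construction and secondFunction on destruction.
struct ScopedCalls {
    ScopedCalls()  { firstFunction(); }
    ~ScopedCalls() { secondFunction(); }
};

// Stand-in for main(): one path returns early.
int body(bool leave_early) {
    ScopedCalls guard;  // must be the first statement
    if (leave_early) {
        call_log.push_back("early return");
        return 1;       // guard's destructor still runs here
    }
    call_log.push_back("work");
    return 0;
}
```

The destructor runs on every return path and also when an exception propagates out of the scope and is caught further up, which a call inserted before the final return would miss.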
If you have control of your main program, you can hack a script to do this, and that's by far the easiest way. Simply make sure the insertion points are obvious (odd comments, required placement of tokens, you choose) and unique (including outlawing general coding practices if you have to, to ensure the uniqueness you need is real). Then a dumb string hacking tool to read the source, find the unique markers, and insert your desired calls will work fine.
If the source of the main program comes from other sources, and you don't have control, then to do this well you need a full C++ program transformation engine. You don't want to build this yourself, as just the C++ parser is an enormous effort to get right. Others here have mentioned Clang and GCC as answers.
An alternative is our DMS Software Reengineering Toolkit with its C++ front end. DMS, using its C++ front end, can parse code (for a variety of C++ dialects), build ASTs, and carry out full name/type resolution to determine the meaning/definition/use of all symbols. It provides procedural and source-to-source transformations to enable changes to the AST, and can regenerate compilable source code complete with original comments.

does cdb/windbg have an equivalent to autoexp.dat?

I'd like to change the way some types are displayed using either 'dt' or '??' in a manner similar to how you can do that with autoexp.dat. Is there a way to do this?
For example, I have a structure something like this:
struct Foo
{
union Bar
{
int a;
void *p;
} b;
};
And I've got an array of a few hundred of these, all of which I know point to a structure Bar. Is there any way to tell cdb, in this expression anyway, that 'p' is a pointer to Bar? This is the kind of thing you could do with autoexp. (The concrete example here is that I've got a hashtable that can have keys of any type, but I know the keys are strings. The implementation stores them as void pointers.)
Thanks in advance!
I don't think there's anything as simple as autoexp.dat.
You have a couple potential options - you can write a simple script file with the debugger commands to dump the data structure in the way you want and use the "$<filename" command (or one of its variants). Combined with user aliases you can get this to be pretty easy and natural to use.
The second option is quite a bit more involved, but with it comes much more power - write an extension DLL that dumps your data structure. For something like what you're talking about this is probably overkill. But you have immense power with debugger extensions (in fact, much of the power that comes in the Debugging tools package is implemented this way). The SDK is packaged with the debugger, so it's easy to determine if this is what you might need.
You can say du or da to have it dump memory as unicode or ascii strings.

Does an arbitrary instruction pointer reside in a specific function?

I have a very difficult problem I'm trying to solve: Let's say I have an arbitrary instruction pointer. I need to find out if that instruction pointer resides in a specific function (let's call it "Foo").
One approach to this would be to try to find the start and ending bounds of the function and see if the IP resides in it. The starting bound is easy to find:
void *start = &Foo;
The problem is, I don't know how to get the ending address of the function (or how "long" the function is, in bytes of assembly).
Does anyone have any ideas how you would get the "length" of a function, or a completely different way of doing this?
Let's assume that there is no SEH or C++ exception handling in the function. Also note that I am on a win32 platform, and have full access to the win32 api.
This won't work. You're presuming functions are contiguous in memory and that one address will map to one function. The optimizer has a lot of leeway here and can move code from functions around the image.
If you have PDB files, you can use something like the dbghelp or DIA APIs to figure this out. For instance, SymFromAddr. There may be some ambiguity here as a single address can map to multiple functions.
I've seen code that tries to do this before with something like:
#pragma optimize("", off)
void Foo()
{
}
void FooEnd()
{
}
#pragma optimize("", on)
And then FooEnd-Foo was used to compute the length of function Foo. This approach is incredibly error prone and still makes a lot of assumptions about exactly how the code is generated.
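Whichever way the bounds are obtained (the FooEnd-Foo trick above, or the address and size reported by a symbol API), the membership test itself is a half-open range check. A minimal sketch, where address_in_range is a hypothetical helper and the function is assumed to occupy one contiguous block, which the optimizer does not guarantee:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when ip lies in the half-open range [start, start + length).
// Assumes the code of interest is one contiguous block of `length` bytes.
bool address_in_range(const void* ip, const void* start, std::size_t length) {
    const auto p = reinterpret_cast<std::uintptr_t>(ip);
    const auto s = reinterpret_cast<std::uintptr_t>(start);
    // Compare the distance rather than computing s + length,
    // so the check cannot overflow.
    return p >= s && p - s < length;
}
```

With the marker-function trick, start would be the address of Foo and length the distance to FooEnd; the result is only as reliable as those two inputs.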
Look at the *.map file which can optionally be generated by the linker when it links the program, or at the program's debug (*.pdb) file.
OK, I haven't done assembly in about 15 years. Back then, I didn't do very much. Also, it was 680x0 asm. BUT...
Don't you just need to put a label before and after the function, take their addresses, subtract them for the function length, and then just compare the IP? I've seen the former done. The latter seems obvious.
If you're doing this in C, look first for debugging support. ChrisW is spot on with map files, but also see if your C compiler's standard library provides anything for this low-level stuff: most compilers provide tools for analysing the stack etc., for instance, even though it's not standard. Otherwise, try just using inline assembly, or wrapping the C function with an assembly file and an empty wrapper function with those labels.
The most simple solution is maintaining a state variable:
volatile int FOO_is_running = 0;
int Foo( int par ){
FOO_is_running = 1;
/* do the work */
FOO_is_running = 0;
return 0;
}
Here's how I do it, but it's using gcc/gdb.
$ gdb ImageWithSymbols
gdb> info line * 0xYourEIPhere
Edit: Formatting is giving me fits. Time for another beer.