Create an LLVMModule from a single, existing function - llvm

I'm using the C API of LLVM to create libFoo.a, which should contain a single, trivial (x+42) function, to be called from my main program.
I am able to create Foo.o and from there libFoo.a, however it ends up containing a few other functions which I don't want to be part of it, since they cause linking errors (e.g. due to multiple main functions).
Logically, I would expect two ways to solve this: either by removing the functions that are not needed, or by creating a new LLVMModule which contains just the function I need.
I'm creating Foo.o by calling LLVMTargetMachineEmitToFile(target_machine, ModuleWithTooManyFunctions, "Foo.o", LLVMCodeGenFileType::LLVMObjectFile, error_msg)
I have my desired function available by an LLVMValueRef.
I'm quite new to LLVM, so I don't know how to proceed from here.
My first solution (creating a new LLVMModule, calling LLVMAddFunction on it) got stuck when I noticed that it expects a TypeRef instead of a ValueRef.
Since I'm generally not too convinced about my attempt to solve this I thought I might ask here.

Related

C++ link time resource "allocation" without defines

I'm currently working on a C++ class for an ESP32. I want to implement allocation of resources such as IO pins, available RMT channels and so on.
My idea is to do this with some kind of resource handler which checks this at compile time, but I have no good idea nor did I find anything about something like this yet.
To clarify my problem, let's have an example of what I mean.
Microcontroller X has IO pins 1-5, each of these can be used by exactly one component.
Components don't know anything about each other and take the pin they should use as a ctor argument.
Now I want to have a class/method/... that checks at compile time whether the pin a component needs is already allocated.
CompA a(5); //works well: 5 is not in use
CompB b(3); //same as before, without the next line it should compile
CompC c(5); //Pin 5 is already in use: does not compile!
I'm not sure yet how to do so. My best guess (as I can't use defines here: users should be able to use it only by giving a parameter or template argument) is that it might work with a template function, but I have not yet found any way of checking which other parameters have been passed to a template method/class.
Edit1: Parts of the program may be either autogenerated or user defined in such a manner that they do not know about other pin usages. The allocation is thus a "security" feature which should disallow erroneous code. It should also forbid duplicate use if the register functions are in different code paths (even if they might exclude each other).
Edit2: I got a response that compile time is wrong here, as components might be compiled separately from one another. So the only way to do this seems to be a linker error.
A silly C-style method: you could desperately use __COUNTER__ as the constructor's argument. This dynamic macro increases itself after each appearance, starting with 0.
I hope there's a better solution.
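For the single-translation-unit case, a compile-time check can be sketched with a variadic template and a constexpr helper. This is only a minimal sketch under a strong assumption: every pin is listed together in one place, which Edit2 of the question says may not hold. The names all_distinct and PinRegistry are invented for this illustration and do not come from the ESP32 SDK.

```cpp
#include <initializer_list>

// Hypothetical helper: true when no pin number occurs twice in the list.
constexpr bool all_distinct(std::initializer_list<int> pins) {
    for (const int* i = pins.begin(); i != pins.end(); ++i)
        for (const int* j = i + 1; j != pins.end(); ++j)
            if (*i == *j) return false;
    return true;
}

// Hypothetical registry: instantiating it with a repeated pin fails to compile.
template <int... Pins>
struct PinRegistry {
    static_assert(all_distinct({Pins...}), "a pin is allocated twice");
};

PinRegistry<5, 3> board_pins;         // compiles: 5 and 3 are distinct
// PinRegistry<5, 3, 5> broken_pins;  // would not compile: pin 5 is claimed twice
```

This needs C++14 or later for the loops in the constexpr function. It does not help when components are compiled separately, because no single translation unit sees all pins.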

Removal of unused or redundant code [duplicate]

How do I detect function definitions which are never getting called and delete them from the file and then save it?
Suppose I have only 1 CPP file as of now, which has a main() function and many other function definitions (function definition can also be inside main() ). If I were to write a program to parse this CPP file and check whether a function is getting called or not and delete if it is not getting called then what is(are) the way(s) to do it?
There are few ways that come to mind:
1. I would find out line numbers of beginning and end of main(). I can do it by maintaining a stack of opening and closing braces { and }.
2. Anything after main would be function definition. Then I can parse for function definitions. To do this I can parse it the following way:
< string >< open paren >< comma separated string(s) for arguments >< closing paren >
3. Once I have all the names of such functions as described in (2), I can make a map with its names as key and value as a bool, indicating whether a function is getting called once or not.
4. Finally parse the file once again to check for any calls for functions with their name as in this map. The function call can be from within main or from some other function. The value for the key (i.e. the function name) could be flagged according to whether a function is getting called or not.
I feel I have complicated my logic and it could be done in a smarter way. With the above logic it would be hard to find all the corner cases (there would be many). Also, there could be function pointers to make parsing logic difficult. If that's not enough, the function pointers could be typedefed too.
How do I go about designing my program? Are a map (to maintain filenames) and stack (to maintain braces) the right data structures or is there anything else more suitable to deal with it?
Note: I am not looking for any tool to do this. Nor do I want to use any library (if it exists to make things easy).
I think you should not try to build a C++ parser from scratch because, as others said in the comments, that is really hard. IMHO, you'd better start from the Clang libraries, which can do the low-level parsing for you, and work directly with the abstract syntax tree.
You could even use crange as an example of how to use them to produce a cross reference table.
Alternatively, you could directly use GNU global, because its gtags command directly generates definition and reference databases that you have to analyse.
IMHO those two ways would be simpler than creating a C++ parser from scratch.
The simplest approach for doing it yourself I can think of is:
Write a minimal parser that can identify functions. It just needs to detect the start and ending line of a function.
Programmatically comment out the first function, save to a temp file.
Try to compile the file by invoking the compiler.
Check if there are compile errors, if yes, the function is called, if not, it is unused.
Continue with the next function.
This is a comment, rather than an answer, but I post it here because it's too long for a comment space.
There are lots of issues you should consider. First of all, you should not assume that main() is the first function in a source file.
Even if it is, there should be some function declarations before main() so that the compiler can recognize their invocation in main.
Next, function's opening and closing brace needn't be in separate lines, they also needn't be the only characters in their lines. Generally, almost whole C++ code can be put in a single line!
Furthermore, functions can differ in their parameters' types while having the same name (overloading), so you can't recognize which function is called if you don't parse the whole code down to the parameters' types. And even more: you will have to perform type list matching with standard conversions/casts, possibly considering inline constructor calls. Of course you should not forget default parameters. Google for resolving overloaded function call, for example see an outline here
Additionally, there may be chains of unused functions. For example, if a() calls b() and b() calls c() and d(), but a() itself is not called, then all four are unused, even though there exist 'calls' to b(), c() and d().
There is also a possibility that functions are called through a pointer, in which case you may be unable to find a call. Example:
int (*testfun)(int) = whattotest ? TestFun1 : TestFun2; // no call
int testResult = testfun(paramToTest); // unknown function called
Finally the code can be pretty obfuscated with #defines.
Conclusion: you'll probably have to write your own C++ compiler (except the machine code generator) to achieve your goal.
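The chain problem mentioned in this answer (a() calls b(), but nothing calls a()) is a reachability question on the call graph. Assuming the call graph has already been extracted somehow, which is the hard part, a minimal sketch of the reachability step could look like this. CallGraph and reachable_from are names invented for this example.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical representation: caller name -> names of the functions it calls.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Returns every function reachable from root by following calls.
// Anything defined but missing from the result is unused.
std::set<std::string> reachable_from(const CallGraph& calls,
                                     const std::string& root) {
    std::set<std::string> seen;
    std::vector<std::string> todo{root};
    while (!todo.empty()) {
        std::string f = todo.back();
        todo.pop_back();
        if (!seen.insert(f).second) continue;  // already visited
        auto it = calls.find(f);
        if (it == calls.end()) continue;       // f calls nothing we know of
        for (const auto& callee : it->second) todo.push_back(callee);
    }
    return seen;
}
```

With the example from the answer, starting from main leaves a(), b(), c() and d() outside the result even though textual calls to b(), c() and d() exist. Calls through function pointers would still be missed unless the graph builder accounts for them.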
This is a very rough idea and I doubt it's very efficient but maybe it can help you get started. First traverse the file once, picking out any function names (I'm not entirely sure how you would do this). But once you have those names, traverse the file again, looking for the function name anywhere in the file, inside main and other functions too. If you find more than 1 instance it means that the function is being called and should be kept.

Changing what a function points to

I have been playing around with pointers and function pointers in C/C++. As you can get the address of a function, can you change where a function call actually ends up?
I tried getting the memory address of a function, then writing a second function's address to that location, but it gave me an access violation error.
Regards,
Function pointers are variables, just like ints and doubles. The address of a function is something different. It is the location of the beginning of the function in the .text section of the binary. You can assign the address of a function to a function pointer of the same type; however, the .text section is read-only, so you can't modify it. Writing to the address of a function would attempt to overwrite the code at the beginning of the function and is therefore not allowed.
Note:
If you want to change, at runtime, where function calls end up, you can create something called a virtual dispatch table, or vtable. This is a structure containing function pointers and is used in languages such as C++ for polymorphism.
e.g.:
struct VTable {
int (*foo)(void);
int (*bar)(int);
} vTbl;
At runtime you can change the values of vTbl.foo and vTbl.bar to point to different functions and any calls made to vTbl.foo() or .bar will be directed to the new functions.
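A small sketch of that redirection, reusing the VTable layout from above. The functions foo_v1, foo_v2, bar_double and call_foo are invented here purely for illustration.

```cpp
struct VTable {
    int (*foo)(void);
    int (*bar)(int);
};

// Hypothetical implementations, invented for this sketch.
int foo_v1(void) { return 1; }
int foo_v2(void) { return 2; }
int bar_double(int x) { return 2 * x; }

// The table starts out pointing at the first set of functions.
VTable vTbl = {foo_v1, bar_double};

// Callers go through the table, so they pick up whatever is installed.
int call_foo() { return vTbl.foo(); }
```

After an assignment such as vTbl.foo = foo_v2; every later call_foo() runs foo_v2 instead, without any code being overwritten.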
If the function you're trying to call is inlined, then you're pretty much out of luck. However, if it's not inlined, then there may be a way:
On Unix systems there's a common feature of the dynamic linker called LD_PRELOAD which allows you to override functions in shared libraries with your own versions. See the question What is the LD_PRELOAD trick? for some discussion of this. If the function you're trying to hijack is not loaded from a shared library (i.e. if it's part of the executable or if it's coming from a statically linked library), you're probably out of luck.
On Windows, there are other attack vectors. If the function to be hooked is exported by some DLL, you could use Import Address Table Patching to hijack it without tinkering with the code of the function. If it's not exported by the DLL but you can get the address of it (i.e. by taking the address of a function) you could use something like the free (and highly recommended) N-CodeHook project.
In some environments, it is possible to "patch" the beginning instructions of a function to make the call go somewhere else. This is an unusual technique and is not used for normal programming. It is sometimes used if you have an existing compiled program and need to change how it interacts with the operating system.
Microsoft Detours is an example of a library that has the ability to do this.
You can change what a function pointer points to, but you can't change a normal function nor can you change what the function contains.
You generally can't find where a function ends. The language has no standard facility for this, and the compiler can optimize code so that a function's code isn't contiguous and has no single end point. To find where the code ends, you would need to use non-standard tools or disassemble the code and make sense of it, which isn't something you can easily automate.

Problem using accessors in V8

I'm writing a wrapper class around the V8 engine so that eventually I'll be able to do something like this
script->createClass("Test");
script->getClass("Test")->addFunction("funct1",testfunct1);
script->getClass("Test")->addVariable("x",setter,getter);
So far I can create classes and add functions to them and it works perfectly, however I have encountered a problem with adding variables.
My class template is stored as such
Persistent<Object> classInstance;
and I try to add an Accessor like this:
this->classInstance->SetAccessor(String::New(variableName),setter,getter);
Compiling this code gives me the error that v8::Object doesn't have a SetAccessor function (though I've seen doxygen documentation that says otherwise).
So my question is: How can I fix this? Is it possible to cast an Object to an ObjectTemplate?
SetAccessor on Object is available as of V8 2.2.12, which was released May 2010. (Before that, it was indeed only available on ObjectTemplate.) You should probably update your copy of V8.

C++ How to replace a function but still use the original function in it?

I want to modify the glBindTexture() function to keep track of the previously bound texture IDs. At the moment I have just created a new function for it, but then I realised that if I use other code that calls glBindTexture, my whole system might break.
So how do I do it?
Edit: Now that I think about it, checking whether I should bind the texture or not is quite useless, since OpenGL probably does this already. But I still need to keep track of the previously used texture.
As Andreas says in the comment, you should check whether this is necessary. Still, if you want to do such a thing and you use the GNU linker (you don't specify the operating system), you could use the linker option:
--wrap glBindTexture
(if given directly to gcc you should write):
-Wl,--wrap,glBindTexture
As this is done at linker stage, you can use your new function with an already existing library (edit: by 'library' I mean some existing code which you can recompile but which you wouldn't want to modify).
The code for the 'replacement' function will look like:
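/* Resolved by the linker to the original glBindTexture. */
void __real_glBindTexture (GLenum target, GLuint texture);
/* glBindTexture returns void, so the wrapper does too (use extern "C" in C++). */
void __wrap_glBindTexture (GLenum target, GLuint texture) {
    printf ("glBindTexture wrapper\n");
    __real_glBindTexture (target, texture);
}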
You actually can do this. Take a look at LD_PRELOAD. Create a shared library that defines glBindTexture. To call the original implementation from within the wrapper, dlopen the real OpenGL library and use dlsym to call the right function from there.
Now have all client code LD_PRELOAD your shared lib so that their OpenGL calls go to your wrapper.
This is the most common method of intercepting and modifying calls to shared libraries.
You can intercept and replace all calls to glBindTexture. To do this you need to create your own OpenGL DLL which intercepts all OpenGL function calls, does the bookkeeping you want and then forwards the function calls to the real OpenGL DLL. This is a lot of work, so I would definitely think twice before going down this route...
Programs like GLIntercept work like this.
One possibility is to use a macro to replace existing calls to glBindTexture:
#define glBindTexture(target, texture) myGlBindTexture(target, texture)
Then in your code, where you want to ensure against using the macro, you surround the name with parentheses:
(glBindTexture)(someTarget, someTexture);
A function-like macro is only replaced where the name is followed immediately by an open-parenthesis, so this prevents macro expansion.
Since this is a macro, it will only affect source code compiled with the macro visible, not an existing DLL, static library, etc.
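The macro technique can be tried without OpenGL using a stand-in function. In this sketch bindTexture plays the role of glBindTexture; it returns a value only so that the effect can be observed, whereas the real glBindTexture returns void. All names here are hypothetical.

```cpp
// Stand-in for the library function (the real glBindTexture returns void).
int bindTexture(int target, int texture) { return target + texture; }

// Bookkeeping the wrapper maintains: the texture most recently bound through it.
int last_bound = -1;

// From here on, ordinary-looking calls are redirected to the wrapper.
#define bindTexture(target, texture) trackedBindTexture(target, texture)

int trackedBindTexture(int target, int texture) {
    last_bound = texture;
    // The parenthesized name is not followed directly by '(',
    // so the macro does not expand and the original function is called.
    return (bindTexture)(target, texture);
}
```

A call written as bindTexture(1, 7) now goes through trackedBindTexture and updates last_bound, while (bindTexture)(1, 9) bypasses the tracking.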
I haven't ever worked with OpenGL, so not knowing anything about that function, here's my best guess. You would want to replace the glBindTexture function call with your new function's call anywhere it occurs in your code. If you use library functions that will call glBindTexture internally, then you should probably figure out a way to reverse what glBindTexture does. Then, anytime you call something that binds a texture, you can immediately call your reversal function to undo its changes.
The driver WON'T do it, it's in the spec. YOU have to ensure that you don't bind the same texture twice, so it's a good idea.
However, it's an even better idea to separate the concerns: let the low-level OpenGL deal with its low-level stuff, and your (thin, thick, as you want) abstraction layer do the higher-level stuff.
So, create a oglWrapper::BindTexture function that does the if(), but you should not play around with LD, even if this is technically possible.
[EDIT] In fact, it's not in the ogl spec, but still.
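A thin wrapper of the kind suggested here might look like the following sketch. It assumes the low-level call is passed in as a function pointer so the example runs without OpenGL; with OpenGL that pointer would be glBindTexture with its GLenum and GLuint parameters. TextureBinder, BindFn and fakeBind are names made up for the illustration.

```cpp
#include <map>

// Signature of the low-level bind call (assumed; plain unsigned replaces GLenum/GLuint).
using BindFn = void (*)(unsigned target, unsigned texture);

// Hypothetical thin wrapper: remembers the last texture bound per target
// and forwards only when the binding actually changes.
class TextureBinder {
public:
    explicit TextureBinder(BindFn bind) : bind_(bind) {}

    // Returns true if the call was forwarded to the underlying function.
    bool bindTexture(unsigned target, unsigned texture) {
        auto it = bound_.find(target);
        if (it != bound_.end() && it->second == texture) return false;
        bind_(target, texture);
        bound_[target] = texture;
        return true;
    }

private:
    BindFn bind_;
    std::map<unsigned, unsigned> bound_;  // target -> currently bound texture
};

// Stand-in for the real call so the sketch runs without OpenGL.
int forwarded_calls = 0;
void fakeBind(unsigned, unsigned) { ++forwarded_calls; }
```

The bookkeeping is only valid while all binds go through the wrapper; code that calls the low-level function directly will leave the recorded state stale, which is the concern raised in the question.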
In general, the approaches have been catalogued under the heading of "seams", as popularized in M. Feathers' 2004 book Working Effectively with Legacy Code. The book focuses on finding seams in a monolithic application to isolate parts of it and put them under automated testing.
Feathers' seams can be found in the following places
compiler
the ifunc function attribute in GCC, https://gcc.gnu.org/onlinedocs/gcc-4.7.2/gcc/Function-Attributes.html
preprocessor
change what gets used with a #define
linker
-Wl,--wrap,func_xyz
linking order, first found symbol gets used, program can delegate using dlsym(RTLD_NEXT, ...)
the binary format has a Procedure Linkage Table which can be modified by the program itself when it runs
in Java, much can be achieved in the JVM, see for example Mockito
language features
function pointers, this can actually be done so as to add no syntactic overhead at point of call!
object inheritance: inherit, override, call super()
sources:
https://www.informit.com/articles/article.aspx?p=359417&seqNum=3
https://www.cute-test.com/guides/mocking-with-cute/