How to run a module pass in LLVM - c++

I'm trying to find a way to optimize away empty global constructors. Previous optimizations will turn constructors into functions that do nothing. I need to add a new pass to remove these functions from llvm.global_ctors.
First, I tried optimizeGlobalCtorsList but this function doesn't actually call the callback I give it even though llvm.global_ctors is populated.
Then I tried running GlobalOptPass. I tried this:
llvm::GlobalOptPass pass;
llvm::ModuleAnalysisManager MAM{true};
pass.run(module, MAM);
This ends up dereferencing a null pointer in AnalysisManager::lookupPass. I think I need to perform some sort of initialization or registration but I don't know how to do that. All the references on "llvm pass registration" talk about registering the pass with opt. I don't want to do that. I just want to run the pass.

Look in lib/Transforms/IPO/PassManagerBuilder.cpp (or lib/Passes/PassBuilder.cpp for the new pass manager) to see how opt sets up its pass pipeline. The code for opt is in tools/opt/opt.cpp and is very small, delegating almost all of its work to the core libraries.
You could use opt as a template for your own tool, or you could hack on the pass-building pipeline to insert your pass where you want it.

Related

Calling a llvm pass outside of a pass

I am new to LLVM and C++ and was trying to write some code to perform static analysis. My static analysis needs access to memory dependence info, which in LLVM can be obtained using MemoryDependenceAnalysis. This analysis generates an object of type MemoryDependenceResults, which is precisely what I need. The only ways I've seen this object being obtained, though, is through an LLVM pass and that's not something I want. My impression is you have to write a pass to be able to use an existing pass. I was wondering is that true? Can I call a pass outside of a pass, i.e. regular code? Or alternatively can a llvm pass be invoked programmatically without needing to run the opt command?
What I need is a way to obtain this MemoryDependenceResults object in my program (which is not a pass) and then perform some more manipulations to it.
First, you can create a PassManager instance anywhere, add the pass to the manager, and run it. There are two pass managers you can use: the legacy one or the new one. To keep it simple, I recommend you try the legacy one first:
legacy::PassManager passManager;
passManager.add(new MemoryDependenceWrapperPass());
passManager.run(*module);
Second, if you wish to run a transform pass, you call it. But if you wish to run an analysis pass, you need at least a wrapper pass to get the analysis result since the API getAnalysis() is only available in a pass. (You can copy and rename MemoryDependenceWrapperPass to your version.)

Lazy evaluation for subset of class methods

I'm looking to make a general, lazy evaluation-esque procedure to streamline my code.
Right now, I have the ability to speed up the execution of mathematical functions - provided that I pre-process it by calling another method first. More concretely, given a function of the type:
const Eigen::Matrix<double, -1, -1> function_name(const Eigen::Matrix<double, -1, -1>& input)
I can pass this into another function, g, which will produce a new version of function_name g_p, which can be executed faster.
I would like to abstract all this busy-work away from the end-user. Ideally, I'd like to make a class such that when any function f matching function_name's method signature is called on any input (say, x), the following happens instead:
The class checks if f has been called before.
If it hasn't, it calls g(f), followed by g_p(x).
If it has, it just calls g_p(x)
This is tricky for two reasons. The first is that I don't know how to get a reference to the current method and pass it to g, or whether that's even possible. There might be a way around this, but passing one function to the other would be simplest/cleanest for me.
The second bigger issue is how to force the calls to g. I have read about the execute around pattern, which almost works for this purpose - except that, unless I'm understanding it wrong, it would be impossible to reference f in the surrounding function calls.
Is there any way to cleanly implement my dream class? Ideally I want to eventually generalize beyond the type of function_name (perhaps with templates), but I can take this one step at a time. I am also open to other solutions that give the same functionality.
I don't think a "perfect" solution is possible in C++, for the following reasons.
If the calling site says:
result = object->f(x);
As compiled, this will call into the unoptimized version. At this point you're pretty much hamstrung, since there's no way in C++ to change where a function call goes: that's determined at compile time for non-virtual calls, and at runtime via vtable lookup for virtual calls. Whatever the case, it's not something you can directly alter. Other languages do allow this, e.g. Lua, and rather ironically C++'s great-grandfather BCPL also permits it. However, C++ doesn't.
TL;DR to get a workable solution to this, you need to modify either the called function, or every calling site that uses one of these.
Long answer: you'll need to do one of two things. You can either offload the problem to the called class and make all functions look something like this:
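const <return_type> myclass::f(<arg_type> x)
{
    // Captureless lambda holding the original, unoptimized body.
    static const auto unoptimized = [](<arg_type> x) -> <return_type>
    {
        // Do the optimizable heavy lifting here;
        return whatever;
    };
    // A function-local static is initialized on the first call only, so g runs once.
    static const auto optimized = g(unoptimized);
    return optimized(x);
}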
However I very strongly suspect this is exactly what you don't want to do, because assuming the end-user you're talking about is the author of the class, this fails your requirement to offload this from the end-user.
However, you can also solve it by using a template, but that requires modification to every place you call one of these. In essence you encapsulate the above logic in a template function, replacing unoptimized with the bare class member, and leaving most everything else alone. Then you just call the template function at the calling site, and it should work.
This does have the advantage of a relatively small change at the calling site:
result = object->f(x);
becomes either:
result = optimize(object->f, x);
or:
result = optimize(object->f)(x);
depending on how you set the optimize template up. It also has the advantage of no changes at all to the class.
So I guess it comes down to where you want to make the changes.
Yet another choice. Would it be an option to take the class as authored by the end user, and pass the cpp and h files through a custom pre-processor? That could go through the class and automatically make the changes outlined above, which then yields the advantage of no change needed at the calling site.
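To make the call-site option concrete, here is a minimal sketch of an optimize wrapper for free functions only. It is not the asker's real code: Vec stands in for the Eigen matrix type, g is a hypothetical placeholder that merely wraps f and counts its calls, and the cache keyed by function address is one possible way to track "has f been seen before". It is not thread-safe as written.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <vector>

// Stand-ins for the question's types: a plain vector instead of an Eigen matrix.
using Vec = std::vector<double>;
using Fn = Vec (*)(const Vec&);                 // signature shared by every f
using FastFn = std::function<Vec(const Vec&)>;  // what g hands back (g_p)

// Hypothetical g: counts its calls and returns a wrapper around f.
// The real preprocessing step would go here.
int g_calls = 0;
FastFn g(Fn f) {
    assert(f != nullptr);
    ++g_calls;
    return [f](const Vec& x) { return f(x); };
}

// Calls g(f) the first time a given f is seen, then reuses the cached g_p.
Vec optimize(Fn f, const Vec& x) {
    static std::map<Fn, FastFn> cache;  // keyed by function address
    auto it = cache.find(f);
    if (it == cache.end()) {
        it = cache.emplace(f, g(f)).first;
    }
    return it->second(x);
}

// Example function matching the shared signature.
Vec doubled(const Vec& x) {
    Vec out;
    out.reserve(x.size());
    for (double v : x) out.push_back(2.0 * v);
    return out;
}
```

With this in place, a call site changes from doubled(x) to optimize(doubled, x), and g runs only on the first call for each distinct function. Member functions would need the object passed alongside a pointer-to-member, which this sketch leaves out.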

Where should the user-defined parameters of a framework be ?

I am kind of a newbie and I am creating a framework to evolve objects in C++ with an evolutionary algorithm.
An evolutionary algorithm evolves objects and tests them to get the best solution (for example, evolve the weights of a neural network and test it on sample data, so that in the end you get a network with good accuracy, without having trained it).
My problem is that there are lots of parameters for the algorithm (type of selection/crossover/mutation, probabilities for each of them...) and since it is a framework, the user should be able to easily access and modify them.
CURRENT SOLUTION
For now, I created a header file parameters.h of this form:
// DON'T CHANGE THESE PARAMETERS
//mutation type
#define FLIP 1
#define ADD_CONNECTION 2
#define RM_CONNECTION 3
// USER DEFINED
static const int TYPE_OF_MUTATION = FLIP;
The user modifies the static variable TYPE_OF_MUTATION, and my mutation function then tests its value and calls the right mutation function.
This works well, but it has a few drawbacks:
when I change a parameter in this header and then call "make", no change is taken into account, I have to call "make clean" then "make". From what I saw, it is not a problem in the makefile but it is how building works. Even if it did re-build when I change a parameter, it would mean re-compile the whole project as these parameters are used everywhere; it is definitely not efficient.
if you want to run the genetic algorithm several times with different parameters, you have to run it a first time then save the results, change the parameters then run it a second time etc.
OTHER POSSIBILITIES
I thought about taking these parameters as arguments of the top-level function. The problem is that the function would then take 20 arguments or so, it doesn't seem really readable...
What I mean about the top-level function is that for now, the evolutionary algorithm is run simply by doing this:
PopulationManager myPop;
myPop.evolveIt();
If I defined the parameters as arguments, we would have something like:
PopulationManager myPop;
myPop.evolveIt(20,10,5,FLIP,9,8,2,3,TOURNAMENT,0,23,4);
You can see how hellish it may be to always define parameters in the right order !
CONCLUSION
The frameworks I know make you build your algorithm yourself from pre-defined functions, but the user shouldn't have to go through all the code to change parameters one by one.
It may be useful to indicate that this framework will be used internally, for a definite set of projects.
Any input about the best way to define these parameters is welcome !
If the options do not change I usually use a struct for this:
enum class MutationType {
Flip,
AddConnection,
RemoveConnection
};
struct Options {
// Documentation for mutation_type.
MutationType mutation_type = MutationType::Flip;
// Documentation for integer option.
int integer_option = 10;
};
And then provide a constructor that takes these options.
Options options;
options.mutation_type = MutationType::AddConnection;
PopulationManager population(options);
C++11 makes this really easy, because it allows specifying defaults for the options, so a user only needs to set the options that need to be different from the default.
Also note that I used an enum for the options, this ensures that the user can only use correct values.
This is a classic example of polymorphism. In your proposed implementation you're switching on a constant to choose which mutation algorithm is applied to the parameter. In C++, the corresponding mechanisms for selecting the mutation algorithm are templates (static polymorphism) or virtual functions (dynamic polymorphism).
The templates way has the advantage that everything is resolvable at compile time and the resulting mutating algorithm could be inlined entirely, depending on the implementation. What you give up is the ability to dynamically select parameter mutation algorithms at runtime.
The virtual function way has the advantage that you can defer the choice of mutation algorithm until runtime, allowing this to vary based on input from the user or whatnot. The disadvantage is that the mutation algorithm can no longer be inlined and you pay the cost of a virtual function call (an extra level of indirection) when you mutate the parameter.
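As a rough illustration of the two approaches, the sketch below applies the same two toy mutations once through a template parameter and once through a virtual interface. All names here (Genome, FlipFirst, AppendZero, Mutator, make_mutator) are made up for the example and are not part of the asker's framework.

```cpp
#include <cassert>
#include <memory>
#include <vector>

using Genome = std::vector<int>;

// --- Static polymorphism: the algorithm is a template parameter. ---
struct FlipFirst {
    // Flips the first bit of a non-empty genome (0 <-> 1).
    void operator()(Genome& g) const { assert(!g.empty()); g[0] = 1 - g[0]; }
};
struct AppendZero {
    void operator()(Genome& g) const { g.push_back(0); }
};

// Chosen at compile time; the call can be inlined.
template <typename Mutation>
void mutate_static(Genome& g, Mutation m = Mutation{}) { m(g); }

// --- Dynamic polymorphism: the algorithm sits behind a virtual call. ---
struct Mutator {
    virtual ~Mutator() = default;
    virtual void mutate(Genome& g) const = 0;
};
struct FlipMutator : Mutator {
    void mutate(Genome& g) const override { FlipFirst{}(g); }
};
struct AppendMutator : Mutator {
    void mutate(Genome& g) const override { AppendZero{}(g); }
};

// Hypothetical factory: picks the algorithm from a runtime flag,
// e.g. one read from user input or a config file.
std::unique_ptr<Mutator> make_mutator(bool flip) {
    if (flip) return std::make_unique<FlipMutator>();
    return std::make_unique<AppendMutator>();
}
```

mutate_static<FlipFirst>(genome) fixes the choice when the code is compiled, while make_mutator(flag)->mutate(genome) defers it to runtime at the cost of an indirect call.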
If you want to see a real example of how "algorithmic mutation" can work, look at evolve.cpp in my Iterated Dynamics repository on github. This is C code converted to C++ so it is neither using templates nor using virtual functions. Instead it uses function pointers and a switch-on-constant to select the appropriate code. However, the idea is the same.
My recommendation would be to see if you can use static polymorphism (templates) first. From your initial description you were fixing the mutation at compile-time anyway, so you're not giving anything up.
If that was just a prototyping phase and you intended to support switching of mutation algorithms at runtime, then look at virtual functions. As the other answer recommended, please shun C-style coding like #define constants and instead use proper enums.
To solve the "long parameter list smell", the idea of packing all the parameters into a structure is a good one. You can achieve more readability on top of that by using the builder pattern to build up the structure of parameters in a more readable way than just assigning a bunch of values into a struct. In this blog post, I applied the builder pattern to the resource description structures in Direct3D. That allowed me to more directly express these "bags of data" with reasonable defaults and directly reveal my intent to override or replace default values with special values when necessary.
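A builder over such an options struct might look like the sketch below. This is not the code from the blog post mentioned above; population_size, mutation_rate, and the OptionsBuilder class are hypothetical names chosen for illustration.

```cpp
#include <cassert>

enum class MutationType { Flip, AddConnection, RemoveConnection };

// Plain bag of parameters with defaults.
struct Options {
    MutationType mutation_type = MutationType::Flip;
    int population_size = 100;    // hypothetical parameter
    double mutation_rate = 0.01;  // hypothetical parameter
};

// Each setter returns *this so calls can be chained; anything not
// mentioned keeps its default.
class OptionsBuilder {
public:
    OptionsBuilder& mutation(MutationType t) {
        opts_.mutation_type = t;
        return *this;
    }
    OptionsBuilder& population(int n) {
        assert(n > 0);
        opts_.population_size = n;
        return *this;
    }
    OptionsBuilder& mutation_rate(double p) {
        assert(p >= 0.0 && p <= 1.0);
        opts_.mutation_rate = p;
        return *this;
    }
    Options build() const { return opts_; }

private:
    Options opts_;
};
```

A call such as OptionsBuilder().mutation(MutationType::AddConnection).population(50).build() names every overridden value at the call site, which avoids the ordering problem of a 20-argument function.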

Python: How to check that...?

I'd like some advice on how to check for the correctness of the parameters I receive.
The checking is going to be done in C++, so if there's a good solution using Boost.Python (preferably) or the C API, please tell me about that. Otherwise, tell me what attributes the object should have to ensure that it meets the criteria.
So...
How do you check that an object is a function?
How do you check that an object is a bound method?
How do you check that an object is a class object?
How do you check that a class object is a child of another class?
When in doubt just work out how you would get the required effect by calling the usual Python builtins and translate it to C/C++. I'll just answer for Python, for C you would look up the global such as 'callable' and then call it like any other Python function.
Why would you care about it being a function rather than any other sort of callable? If you want you can find out if it is callable by using the builtin callable(f) but of course that won't tell you which arguments you need to pass when calling it. The best thing here is usually just to call it and see what happens.
isinstance(f, types.MethodType) but that won't help if it's a method of a builtin. Since there's no difference in how you call a function or a bound method you probably just want to check if it is callable as above.
isinstance(someclass, type) Note that this will include builtin types.
issubclass(someclass, baseclass)
I have two unconventional recommendations for you:
1) Don't check. The Python culture is to simply use objects as you need to, and if it doesn't work, then an exception will occur. Checking ahead of time adds overhead, and potentially limits how people can use your code because you're checking more strictly than you need to.
2) Don't check in C++. When combining Python and C (or C++), I recommend only doing things in C++ that need to be done there. Everything else should be done in Python. So check your parameters in a Python wrapper function, and then call an unchecked C++ entry point.

Removing a parameter list from f(list) with preprocessor

It seems to me that I saw something weird being done in a boost library and it ended up being exactly what I'm trying to do now. Can't find it though...
I want to create a macro that takes a signature and turns it into a function pointer:
void f(int,int) {}
...
void (*x)(int,int) = WHAT( (f(int,int)) );
x(2,4); // calls f()
I especially need this to work with member function pointers so that WHAT takes two params:
WHAT(ClassType, (f(int,int))); // results in static_cast<void (ClassType::*)(int,int)>(&ClassType::f)
It's not absolutely necessary in order to solve my problem, but it would make things a touch nicer.
This question has nothing, per-se, to do with function pointers. What needs to be done is to use the preprocessor to take "f(int,int)" and turn it into two different parts:
'f'
'(int,int)'
Why:
I've solved the problem brought up here: Generating Qt Q_OBJECT classes pragmatically
I've started a series of articles explaining how to do it:
http://crazyeddiecpp.blogspot.com/2011/01/quest-for-sane-signals-in-qt-step-1.html
http://crazyeddiecpp.blogspot.com/2011/01/quest-for-sane-signals-in-qt-step-2.html
The signature must be evaluated from, and match exactly, the "signal" that the user is attempting to connect with. Qt users are used to expressing this as SIGNAL(fun(param,param)), so something like connect_static(SIGINFO(object,fun(param,param)), [](int,int){}) wouldn't feel too strange.
In order to construct the signature I need to be able to pull it out of the arguments supplied. There's enough information to get the member function address (using C++0x's decltype) and fetch the signature in order to generate the appropriate wrapper but I can't see how to get it out. The closest I can come up with is SIGINFO(object, fun, (param,param)), which is probably good enough but I figured I'd ask here before considering it impossible to get the exact syntax I'd prefer.
What you are trying to do is impossible with the standard preprocessor, unfortunately. There are a couple of reasons:
It is impossible to split the parameters passed to a macro on a custom character. They have to be comma-delimited. Otherwise that could solve your problem instantly.
You cannot use the preprocessor to define something that is not an identifier. Otherwise you could use double expansion where ( and ) are each defined as , and split the arguments as if they had been passed as f, int, int,, then process them as variadic arguments.
A function pointer definition in C++ does not let you deduce the name given to the defined type, unfortunately.
Going even further, even if you manage to create a function pointer, the code won't work for methods because in order to invoke a method, you need to have two pointers - pointer to the method and to the class instance. This means you have to have some wrapper around this stuff.
That is why Qt uses its own tools like moc to generate glue code.
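The fallback syntax mentioned in the question, where the name and the parenthesized parameter list are separate macro arguments, does work with the standard preprocessor. The sketch below assumes a void return type, as in the question's examples; FN_PTR, MEM_FN_PTR, Widget, and the overloads are hypothetical names for illustration.

```cpp
#include <cassert>

// params must be passed already parenthesized, e.g. (int, int), so the
// commas inside it stay within a single macro argument.
#define FN_PTR(name, params) static_cast<void (*) params>(&name)
#define MEM_FN_PTR(Class, name, params) \
    static_cast<void (Class::*) params>(&Class::name)

// Two overloads of a free function; the cast picks one by signature.
int last_sum = 0;
void f(int a, int b) { last_sum = a + b; }
void f(double) { last_sum = -1; }

// Two overloads of a member function.
struct Widget {
    int last = 0;
    void set(int a, int b) { last = a * b; }
    void set(int a) { last = a; }
};
```

FN_PTR(f, (int, int)) expands to static_cast<void (*) (int, int)>(&f), and MEM_FN_PTR(Widget, set, (int, int)) yields a pointer to member that is invoked as (w.*m)(3, 5) on an instance w.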
The closest thing you might have seen in Boost is probably the Signals, Bind and Lambda libraries. It is ironic that those libraries are much more powerful than what you are trying to achieve, but at the same time they won’t allow you to achieve it the way you want it. For example, even if you could do what you want with the syntax you want, you wouldn’t be able to “connect” a slot to a “signal” if the signal has a different signature. At the same time, the Boost libraries I mentioned above do allow that. For example, if your “slot” expects more parameters than the “signal” provides, you can bind other objects to be passed when the “slot” is invoked. Those libraries can also suppress extra parameters if the “slot” does not expect them.
I’d say the best way from a C++ perspective as of today is to use the Boost Signals approach to implement event handling in GUI libraries. Qt doesn’t use it for a number of reasons. First, it started in the 90s, when C++ was not that fancy. Plus, they have to parse your code in order to work with “slots” and “signals” in the graphical designer.
It seems to me that instead of using macros or, even worse, non-standard tools on top of C++ to generate code, and writing the following:
void (*x)(int,int) = WHAT( (f(int,int)) );
It would be much better to do something like this:
void f (int x, int y, int z);
boost::function<void (int, int)> x = boost::bind (&f, _1, _2, 3);
x (1, 2);
The above works for both functions and methods.
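Since C++11 the same thing can be written with std::function and std::bind from the standard library. The sketch below shows both a free function and a method; Receiver, on_event, bind_free, and bind_method are hypothetical names, and for the method case the instance is bound as the first argument.

```cpp
#include <cassert>
#include <functional>

// Free function with three parameters; the third will be fixed by bind.
int last_total = 0;
void f(int x, int y, int z) { last_total = x + y + z; }

struct Receiver {
    int total = 0;
    void on_event(int x, int y, int z) { total = x * 100 + y * 10 + z; }
};

// Fixes the third argument of the free function to 3.
std::function<void(int, int)> bind_free() {
    using namespace std::placeholders;
    return std::bind(&f, _1, _2, 3);
}

// For a method, the instance pointer is bound ahead of the placeholders.
// The caller must keep r alive for as long as the result is used.
std::function<void(int, int)> bind_method(Receiver& r) {
    using namespace std::placeholders;
    return std::bind(&Receiver::on_event, &r, _1, _2, 3);
}
```

Both returned objects have the two-argument signature void(int, int), so a caller cannot tell whether the target is a free function or a method.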