I’m writing a CallGraphSCCPass which needs dominator tree information on each function. My getAnalysisUsage is fairly straightforward:
virtual void getAnalysisUsage(AnalysisUsage& au) const override
{
au.setPreservesAll();
au.addRequired<DominatorTreeWrapperPass>();
}
The pass is registered like this:
char MyPass::ID = 0;
static RegisterPass<MyPass> tmp("My Pass", "Do fancy analysis", true, true);
INITIALIZE_PASS_BEGIN(MyPass, "My Pass", "Do fancy analysis", true, true)
INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
INITIALIZE_PASS_END(MyPass, "My Pass", "Do fancy analysis", true, true)
When I try to add my pass to a legacy::PassManager, it dies with this error message:
Unable to schedule 'Dominator Tree Construction' required by 'My Pass'
Unable to schedule pass
UNREACHABLE executed at LegacyPassManager.cpp:1264!
I statically link LLVM to my program, and define the pass in my program, too.
Am I doing something wrong? Does it make sense to require the DominatorTreeWrapperPass from a CallGraphSCCPass?
I also sent the question on the LLVM ML, but the server appears to be down at the moment.
If it makes any difference, I'm using LLVM 3.7 trunk, up-to-date as of a few weeks ago.
CallGraphSCCPass appears to be a special case that doesn't support every analysis very well. The simplest thing to do is to convert the pass to a ModulePass, and use <llvm/ADT/SCCIterator.h> to construct call graph SCCs from runOnModule, like how CGPassManager does it.
virtual void getAnalysisUsage(AnalysisUsage& au) const override
{
    au.addRequired<CallGraphWrapperPass>();
    // rest of your analysis usage here...
}
virtual bool runOnModule(Module& m) override
{
    CallGraph& cg = getAnalysis<CallGraphWrapperPass>().getCallGraph();
    // Walk the call graph one strongly connected component at a time.
    scc_iterator<CallGraph*> cgSccIter = scc_begin(&cg);
    CallGraphSCC curSCC(&cgSccIter);
    while (!cgSccIter.isAtEnd())
    {
        const vector<CallGraphNode*>& nodeVec = *cgSccIter;
        curSCC.initialize(nodeVec.data(), nodeVec.data() + nodeVec.size());
        runOnSCC(curSCC);
        ++cgSccIter;
    }
    return false;
}
bool runOnSCC(CallGraphSCC& scc)
{
    // your stuff here
    return false; // return true if the SCC was modified
}
Module passes take no issue at requiring DominatorTreeWrapperPass, or other analyses like MemoryDependenceAnalysis. However, this naive implementation might not support modifications to the call graph like CGPassManager does.
CallGraphSCCPass is run by a module-level pass manager (CGPassManager is itself a ModulePass), and hence is scheduled and invoked by the ModulePassManager. The ModulePassManager invokes the FunctionPassManager (which manages FunctionPasses) and hands control to it only after the module passes have run; in other words, ModulePasses are queued before FunctionPasses in the PassManager's pipeline. So you cannot ask the PassManager for a FunctionPass while you are inside a ModulePass, but you can ask for a ModulePass inside a FunctionPass, since the module passes have already run. For the same reason, you cannot ask for a LoopPass inside a FunctionPass.
I have a value which is expensive to calculate and can be asked for ahead of time, something like a lazily initialized value whose initialization is actually done at the moment of definition, but in a different thread. My immediate thought was to use std.parallelism: Task seems purpose-built for this exact use case. So, let's put it in a class:
class Foo
{
import std.parallelism : Task,task;
static int calculate(int a, int b)
{
return a+b;
}
private Task!(calculate,int,int)* ourTask;
private int _val;
int val()
{
return ourTask.workForce();
}
this(int a, int b)
{
ourTask = task!calculate(a,b);
}
}
That seems all well and good... except when I want the task to be based on a non-static method, in which case I want to make the task a delegate, in which case I start having to do stuff like this:
private typeof(task(&classFunc)) working;
And then, as it turns out, typeof(task(&classFunc)), when it's asked for outside of a function body, is actually Task!(run,ReturnType!classFunc function(Parameters!classFunc))*, which you may notice is not the type actually returned by runtime function calls of that. That would be Task!(run,ReturnType!classFunc delegate(Parameters!classFunc))*, which requires me to cast to typeof(working) when I actually call task(&classFunc). This is all extremely hackish feeling.
This was my attempt at a general template solution:
/**
Provides a transparent wrapper that allows for lazy
setting of variables. When lazySet!func(args) is called
on the value, the function will be called in a new thread;
as soon as the value's access is attempted, it'll return the
result of the task, blocking if it's not done calculating.
Accessing the value is as simple as using it like the
type it's templated for--see the unit test.
*/
shared struct LazySet(T)
{
/// You can set the value directly, as normal--this throws away the current task.
void opAssign(T n)
{
import core.atomic : atomicStore;
working = false;
atomicStore(_val,n);
}
import std.traits : ReturnType;
/**
Called the same way as std.parallelism.task;
after this is called, the next attempt to access
the value will result in the value being set from
the result of the given function before it's returned.
If the task isn't done, it'll wait on the task to be done
once accessed, using workForce.
*/
void lazySet(alias func,Args...)(Args args)
if(is(ReturnType!func == T))
{
import std.parallelism : task,taskPool;
auto t = task!func(args);
taskPool.put(t);
curTask = (() => t.workForce);
working = true;
}
/// ditto
void lazySet(F,Args...)(F fpOrDelegate, ref Args args)
if(is(ReturnType!F == T))
{
import std.parallelism : task,taskPool;
auto t = task(fpOrDelegate,args);
taskPool.put(t);
curTask = (() => t.workForce);
working = true;
}
private:
T _val;
T delegate() curTask;
bool working = false;
T val()
{
import core.atomic : atomicStore,atomicLoad;
if(working)
{
atomicStore(_val,curTask());
working = false;
}
return atomicLoad(_val);
}
// alias this is inherently public
alias val this;
}
This lets me call lazySet using any function, function pointer or delegate that returns T, and then it'll calculate the value in parallel and return it, fully calculated, next time anything tries to access the underlying value, exactly as I wanted. Unit tests I wrote to describe its functionality pass, etc., it works perfectly.
But one thing's bothering me:
curTask = (() => t.workForce);
Moving the Task around by way of creating a lambda on-the-spot that happens to have the Task in its context still seems like I'm trying to "pull one over" on the language, even if it's less "hackish-feeling" than all the casting from earlier.
Am I missing some obvious language feature that would allow me to do this more "elegantly"?
Templates that take an alias function parameter (such as the Task family) are finicky regarding their actual type, as they can receive any type of function as parameter (including in-place delegates that get inferred themselves). As the actual function that gets called is part of the type itself, you would have to pass it to your custom struct to be able to save the Task directly.
As for the legitimacy of your solution, there is nothing wrong with storing lambdas to interact with complicated (or "hidden") types later.
An alternative is to store a pointer to &t.workForce directly.
Also, in your T val() two threads could enter if(working) at the same time, but I guess due to the atomic store it wouldn't really break anything - anyway, that could be fixed by core.atomic.cas.
I am making an application in C++, and it requires a config file that will be read and interpreted on launch. It will contain things such as:
Module1=true
Now, my original plan was to store it all in variables and simply have
if (module1) {
DO_STUFF();
}
However this seems wasteful as it would be checking constantly for a value that would never change. Any ideas?
Optimize the code only if you find a bottleneck with a profiler. Branch prediction should do its thing here: module1 never changes, so even if you check it in a loop there shouldn't be a noticeable performance loss.
If you want to experiment, you can branch once, and make a pointer point to the right function:
using func_ptr = void (*)();
func_ptr p = [](){};
if(module1)
p = DO_STUFF;
while(...)
p();
But this is just something to profile, look at the assembly...
There are also slower, but comfortable ways you could be storing the configuration, e.g. in an array with enumerated indexes, or a map. If I were to get some value in a loop, I'd do:
auto module1 = modules[MODULE1]; // array and enumeration
//auto module1 = modules.at("module1"); // map and string
while(...)
{
if(module1)
DO_STUFF();
...
}
So I'd end up with what you already have.
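As a rough illustration of the map-based storage mentioned above, here is a minimal sketch that reads lines such as Module1=true into a std::map once, at launch. The function name parseConfig and the exact file syntax (one Key=true or Key=false per line, no surrounding whitespace) are assumptions made for this example, not something the question specifies.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: parses "Key=true" / "Key=false" lines into a map.
// Lines without '=' are ignored; anything other than "true" counts as false.
std::map<std::string, bool> parseConfig(const std::string& text) {
    std::map<std::string, bool> modules;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos) {
            continue;
        }
        modules[line.substr(0, eq)] = (line.substr(eq + 1) == "true");
    }
    return modules;
}
```

The lookup (for example modules.at("Module1")) would then be done once and cached in a local bool before any hot loop, which brings you back to the plain if shown above.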
Performance-wise, a boolean check is no problem unless you start doing it millions or billions of times. Maybe you can start merging the code that belongs to module1, but other than that you'd have to check for it like you currently do.
This really isn't an issue. If your program requires that Module1 be true, then let it check the value and continue on. It won't affect your performance unless it is being checked too many times.
One thing you could do is make it an inline function if it is being checked too many times. However, you will have to make sure the function isn't too big, otherwise it will become a bigger bottleneck.
Sorry guys, didn't spot this when I looked it up:
MSDN
So I check the boolean once on launch and then I don't need to anymore as only the correct functions are launched.
Depending on how your program is set up and how the variables change the behaviour of the code you might be able to use function pointers:
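std::function<void(int)> DoStuff = [](int) {}; // no-op unless Module1 is enabled
if (Module1)
{
    DoStuff = Module1Stuff; // declared outside the if so it is still in scope later
}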
And then later:
while(true)
{
DoStuff(ImportantVariable);
}
See http://en.cppreference.com/w/cpp/utility/functional/function for further reference.
Not that I think it'll help all that much but it's an alternative to try out at least.
This can be solved if you know all the use cases of the values you check. For example, if you've read your config file and module1 is true, you do one thing; if it is false, another. Let's start with an example:
#include <memory>

class ConfigFileWorker {
public:
    virtual ~ConfigFileWorker() = default; // objects are deleted through a base pointer
    virtual void run() = 0;
};
class WithModule1Worker : public ConfigFileWorker {
public:
    void run() final override {
        // do stuff as if your `Module1` is true
    }
};
class WithoutModule1Worker : public ConfigFileWorker {
public:
    void run() final override {
        // do stuff as if your `Module1` is false
    }
};
int main() {
    std::unique_ptr<ConfigFileWorker> worker;
    const bool Module1 = read_config_file(file, "Module1");
    if (Module1) { // you check this only once during launch and just use `worker` all the time after
        worker.reset(new WithModule1Worker);
    } else {
        worker.reset(new WithoutModule1Worker);
    }
    // here and after just use the pointer with `run()` - then you will not need to check the variable all the time, you'll just perform action.
}
So you have predefined behaviour for the 2 cases (true and false) and just create an object of one of them while parsing the config file on launch. This is Java-like code; of course you may use function pointers, std::function and other abstractions instead of a base class, but the base-class option has more flexibility in my opinion.
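For comparison, the std::function alternative mentioned above might look roughly like this. It is only a sketch: selectWorker, runWithModule1 and runWithoutModule1 are hypothetical names, and the functions return a string purely so that the choice is observable.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-ins for the two predefined behaviours.
std::string runWithModule1()    { return "module1 enabled"; }
std::string runWithoutModule1() { return "module1 disabled"; }

// Decide once, at launch, which behaviour to use; callers keep the
// returned std::function and never look at the flag again.
std::function<std::string()> selectWorker(bool module1Enabled) {
    if (module1Enabled) {
        return runWithModule1;
    }
    return runWithoutModule1;
}
```

The flag is consulted exactly once, inside selectWorker; every later call goes through the stored callable, at the cost of an indirect call.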
I read through Google Mock: Return() a list of values and found out how to return a single element from a vector on each EXPECT_CALL, as such I wrote the following code which works:
{
testing::InSequence s1;
for (auto anElem:myVecCollection) {
EXPECT_CALL(myMockInstance, execute())
.WillOnce(testing::Return(anElem));
}
}
so far so good...
Now I read not to use EXPECT_CALL unless you need to. https://groups.google.com/forum/#!topic/googlemock/pRyZwyWmrRE
My use case, myMockInstance is really a stub providing data to the SUT(software under test).
However, simply replacing EXPECT_CALL with ON_CALL will not work(??), since ON_CALL with WillByDefault evaluates the return value only once(??)
As such I tried setting up an ACTION.
ACTION_P(IncrementAndReturnPointee, p)
{
return (p)++;
}
ON_CALL(myMockInstance, execute())
.WillByDefault(testing::Return
(*(IncrementAndReturnPointee(myVecCollection.cbegin()))));
Clang gives
error: expected expression 'ACTION_P(IncrementAndReturnPointee, p)'
Then I tried setting up a functor and use the Invoke method on it.
struct Funct
{
Funct() : i(0){}
myClass mockFunc(std::vector<myClass> &aVecOfMyclass)
{
return aVecOfMyclass[i++];
}
int i;
};
Funct functor;
ON_CALL(myMockInstance, execute())
.WillByDefault(testing::Return(testing::Invoke(&functor, functor.mockFunc(myVecCollection))));
Clang gives
no matching function for call to 'ImplicitCast_'
: value_(::testing::internal::ImplicitCast_<Result>(value)) {}
Now, I am fairly new to google-mock but have used google-test extensively.
I am a bit lost with the Google-Mock doc. I wanted to know whether I am on the right path, in terms of what I need.
If one of you could point out which approach is the correct one, or whether I am even close to the right approach, I can take it from there and debug the "close to right approach" further.
Thanks
testing::Return is an action. Your code should look like:
ACTION_P(IncrementAndReturnPointee, p)
{
return *(p++);
}
ON_CALL(myMockInstance, execute())
.WillByDefault(IncrementAndReturnPointee(myVecCollection.cbegin()));
As a side note, it doesn't look like a good idea to use a finite collection myVecCollection. You will probably get a more robust test if you figure out an implementation of the action that creates a new element to return on the fly.
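One way to read the side note above is a stateful callable that builds each element when asked, so it can never run past the end of a collection. The sketch below is plain C++ without gmock: Item is a hypothetical stand-in for myClass, and ItemGenerator is an invented name. Wiring such a callable into the mock (for instance through testing::Invoke) is not shown or verified here.

```cpp
// Hypothetical element type standing in for myClass.
struct Item {
    int id;
};

// Stateful callable: every call creates a fresh element on the fly
// instead of reading from a finite, pre-filled vector.
struct ItemGenerator {
    int next = 0;
    Item operator()() { return Item{next++}; }
};
```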
I'm implementing several passes in LLVM in order to add my own optimization.
These passes are based on FunctionPass and ModulePass.
Currently, each pass is invoked by its own opt command-line option, which is registered with the RegisterPass template.
In the future, I'd like all of these passes to be invoked by a single opt command-line option.
My idea is as follows:
First the Function passes run, and finally the Module pass runs.
Each Function pass uses the analysis information of the preceding Function passes.
The final Module pass constructs a new function using the results of the preceding Function passes.
The whole sequence of passes is invoked by a single opt command-line option that names the final Module pass.
I thought I could do this with the addRequired method of the AnalysisUsage class.
However, it doesn't seem to work:
In a Function pass, several Function passes may be addRequired, in order.
In a Function pass, only one Module pass may be addRequired.
In a Function pass (X), a Function pass and a Module pass cannot both be addRequired.
i.e. running opt with option X locks up.
In a Module pass, only one Module pass may be addRequired.
In a Module pass (Y), a Function pass (Z) cannot be addRequired.
i.e. running opt with option Y executes only Y, and the Function pass (Z) is ignored.
I am not familiar with the pass manager mechanism.
Can anybody help me run the Function pass before the Module pass with only one opt command-line option?
An example run is shown below:
$ opt -stats -load ~/samples/tryPass4.so -MPass4 hello2.ll -S -o tryPass4.ll -debug-pass=Structure
Pass Arguments: -targetlibinfo -datalayout -notti -basictti -x86tti -MPass4 -verify -verify-di -print-module
Target Library Information ↑
Data Layout -FPass4 doesn't appear here
No target information
Target independent code generator's TTI
X86 Target Transform Info
ModulePass Manager
Module Pass
Unnamed pass: implement Pass::getPassName()
FunctionPass Manager
Module Verifier
Debug Info Verifier
Print module to stderr
Pass Arguments: -FPass4 <- here -FPass4 appears, but not executed
FunctionPass Manager
Function Pass
***** Module Name : hello2.ll <- output from the Module pass
The source code for the above is as follows:
using namespace llvm;
namespace{
class tryFPass4 : public FunctionPass {
public :
static char ID;
tryFPass4() : FunctionPass(ID){}
~tryFPass4(){}
virtual bool runOnFunction(llvm::Function &F);
virtual void getAnalysisUsage(llvm::AnalysisUsage &AU) const;
};
class tryMPass4 : public ModulePass {
public :
static char ID;
tryMPass4() : ModulePass(ID){}
~tryMPass4(){}
virtual bool runOnModule(llvm::Module &M);
virtual void getAnalysisUsage(llvm::AnalysisUsage &AU) const;
};
}
bool tryFPass4::runOnFunction(Function &F) {
bool change = false;
....
return change;
}
bool tryMPass4::runOnModule(Module &M) {
bool change = false ;
....
return change;
}
void tryFPass4::getAnalysisUsage(AnalysisUsage &AU) const {
AU.setPreservesCFG();
}
void tryMPass4::getAnalysisUsage(AnalysisUsage &AU) const {
AU.setPreservesCFG();
AU.addRequired<tryFPass4>();
}
char tryFPass4::ID = 0;
static RegisterPass<tryFPass4> X("FPass4", "Function Pass", false, false);
char tryMPass4::ID = 0;
static RegisterPass<tryMPass4> Y("MPass4", "Module Pass", false, false);
I tried to simulate the problem here using LLVM 3.8.1.
I believe your Function pass gets to run here:
Module Pass
Unnamed pass: implement Pass::getPassName()
I do not know why it is marked as unnamed even though getPassName is overridden.
One fine detail to watch: for the function pass to actually execute its runOnFunction method, you need to call the overload of getAnalysis that takes a Function&, as in:
getAnalysis<tryFPass4>(f); // where f is the Function currently being processed
It seems that if the required pass operates on a smaller unit of IR than the pass that requires it, it has to be executed explicitly. I might be mistaken, since I have not yet tried it with a BasicBlockPass required by a FunctionPass.
Let's say you have a function in C/C++, that behaves a certain way the first time it runs. And then, all other times it behaves another way (see below for example). After it runs the first time, the if statement becomes redundant and could be optimized away if speed is important. Is there any way to make this optimization?
bool val = true;
void function1() {
if (val == true) {
// do something
val = false;
}
else {
// do other stuff, val is never set to true again
}
}
gcc has a builtin function that lets you inform the implementation about branch prediction:
__builtin_expect
http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
For example in your case:
bool val = true;
void function1()
{
if (__builtin_expect(val, 0)) {
// do something
val = false;
}
else {
// do other stuff, val is never set to true again
}
}
You should only make the change if you're certain that it truly is a bottleneck. With branch-prediction, the if statement is probably instant, since it's a very predictable pattern.
That said, you can use callbacks:
#include <iostream>
using namespace std;
typedef void (*FunPtr) (void);
FunPtr method;
void subsequentRun()
{
std::cout << "subsequent call" << std::endl;
}
void firstRun()
{
std::cout << "first run" << std::endl;
method = subsequentRun;
}
int main()
{
method = firstRun;
method();
method();
method();
}
produces the output:
first run
subsequent call
subsequent call
You could use a function pointer but then it will require an indirect call in any case:
void firstCall();  // forward declarations so the pointer can be initialized
void otherCalls();
void (*yourFunction)(void) = &firstCall;
void firstCall() {
    ..
    yourFunction = &otherCalls;
}
void otherCalls() {
    ..
}
int main()
{
    yourFunction();
}
One possible method is to compile two different versions of the function (this can be done from a single function in the source with templates), and use a function pointer or object to decide at runtime. However, the pointer overhead will likely outweigh any potential gains unless your function is really expensive.
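A minimal sketch of that idea, assuming the first-run flag can be made a template parameter: the same source function is instantiated twice, and a function pointer selects the version at runtime. The names step, StepFn and Stepper are invented for this example, and the functions return strings only so the behaviour can be observed.

```cpp
#include <string>

// One function in the source, compiled twice. FirstRun is a compile-time
// constant, so each instantiation only needs to keep its own branch.
template <bool FirstRun>
std::string step() {
    if (FirstRun) {
        return "first run";
    }
    return "subsequent call";
}

using StepFn = std::string (*)();

// Hypothetical dispatcher: starts on the first-run version and switches
// to the other instantiation after the first call.
struct Stepper {
    StepFn current = &step<true>;

    std::string operator()() {
        const std::string result = current();
        current = &step<false>;
        return result;
    }
};
```

Note that every call still goes through the pointer, which is the overhead referred to above.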
You could use a static member variable instead of a global variable..
Or, if the code you're running the first time changes something for all future uses (e.g., opening a file), you could use that change as the check that determines whether or not to run the code (i.e., check whether the file is open). This would save you the extra variable. It might also help with error checking: if for some reason the initial change is undone by another operation (e.g., the file is on removable media that is removed improperly), your check could try to redo the change.
A compiler can only optimize what is known at compile time.
In your case, the value of val is only known at runtime, so it can't be optimized.
The if test is very quick, you shouldn't worry about optimizing it.
If you'd like to make the code a little bit cleaner you could make the variable local to the function using static:
void function() {
static bool firstRun = true;
if (firstRun) {
firstRun = false;
...
}
else {
...
}
}
On entering the function for the first time, firstRun would be true, and it would persist so each time the function is called, the firstRun variable will be the same instance as the ones before it (and will be false each subsequent time).
This could be used well with @ouah's solution.
Compilers like g++ (and I'm sure msvc) support generating profile data upon a first run, then using that data to better guess what branches are most likely to be followed, and optimizing accordingly. If you're using gcc, look at the -fprofile-generate option.
The expected behavior is that the compiler will lay out that if statement so that the else branch comes first, thus avoiding the jmp on all your subsequent calls and making it pretty much as fast as if the check weren't there, especially if you return somewhere in that else (thus avoiding having to jump past the 'if' block).
One way to make this optimization is to split the function in two. Instead of:
void function1()
{
if (val == true) {
// do something
val = false;
} else {
// do other stuff
}
}
Do this:
void function1()
{
// do something
}
void function2()
{
// do other stuff
}
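The answer above does not show the calling side. One possible reading is that the caller already knows which call is the first, so it can call function1 once and function2 afterwards. In the sketch below, runAll and the trace vector are hypothetical additions used only to make the order of calls visible.

```cpp
#include <string>
#include <vector>

std::vector<std::string> trace;  // hypothetical log of which function ran

void function1() { trace.push_back("first"); }  // do something
void function2() { trace.push_back("other"); }  // do other stuff

// Hypothetical caller: the one-time work happens up front, and the loop
// body contains no first-run check at all.
void runAll(int calls) {
    if (calls <= 0) {
        return;
    }
    function1();
    for (int i = 1; i < calls; ++i) {
        function2();
    }
}
```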
One thing you can do is put the logic into the constructor of an object, which is then defined static. If such a static object occurs in a block scope, the constructor is run the first time that an execution of that scope takes place. The once-only check is emitted by the compiler.
You can also put static objects at file scope, and then they are initialized before main is called.
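A small sketch of the block-scope variant described above. OneTimeSetup, setupCount and doWork are names made up for this example. Unlike the if/else in the question, the code after the static declaration also runs on the first call, so this fits best when the first-run work is pure setup.

```cpp
// Hypothetical counter so the effect of the one-time setup is observable.
int setupCount = 0;

struct OneTimeSetup {
    OneTimeSetup() { ++setupCount; }  // the first-run logic goes here
};

int doWork() {
    // Constructed the first time control passes through this declaration;
    // the compiler emits the "already initialized?" check itself.
    static OneTimeSetup setup;
    (void)setup;  // silence unused-variable warnings

    // the logic for every call goes here
    return setupCount;
}
```

Since C++11, the initialization of such a block-scope static is also guaranteed to happen exactly once even if several threads reach it concurrently.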
I'm giving this answer because perhaps you're not making effective use of C++ classes.
(Regarding C/C++, there is no such language. There is C and there is C++. Are you working in C that has to also compile as C++ (sometimes called, unofficially, "Clean C"), or are you really working in C++?)
What is "Clean C" and how does it differ from standard C?
To remain compiler-independent, you can code the if() part in one function and the else{} part in another. Almost all compilers optimize if()/else{}, and here the most likely branch is the else{}; hence put the occasionally executed code in the if() and the rest in a separate function that is called from the else.