Get current module inside FunctionPass llvm

I'm writing a function pass in LLVM and need to call the method Module::getOrInsertFunction. I need to access the module of the current function. How do I get it?

You can use the getParent() function, which returns the Module that contains the function: http://llvm.org/docs/doxygen/html/classllvm_1_1GlobalValue.html#a9e1fc23a17e97d2d1732e753ae9251ac

Please refer to http://llvm.org/docs/WritingAnLLVMPass.html
According to that documentation:
To be explicit, FunctionPass subclasses are not allowed to:
1. Inspect or modify a Function other than the one currently being processed.
2. Add or remove Functions from the current Module.
3. Add or remove global variables from the current Module.
4. Maintain state across invocations of runOnFunction (including global data).
So you should not call getOrInsertFunction from inside a FunctionPass, because it may add a function to the current Module. You will need a ModulePass.

Related

Change Name of LLVM Function

I have a LLVM Module object which contains a particular function that I would like to rename. Is there any way of simply changing the name of a Function?
Given a module, you can look up a specific function by name using the getFunction method, or you can iterate over all the functions in the module using begin() and end(). From there, Function inherits from Value, so you can just use the setName method to change the name. This will also automatically update all the references and calls to it inside the same module.

Getting a reference to a Node module and work with it in a separate thread

Assuming I have 2 different sources:
node_module.cc
threaded_class.cc
node_module.cc is where I am calling NODE_MODULE to initialize my module. This module has a function that makes an instance of threaded_class.cc (in a separate thread). I understand that I need to use Lockers and Isolates to access v8 in a separate thread but my issue is bigger than that.
NODE_MODULE function is my only chance to catch the module's instance from my understanding. I found this article that uses a piece of code that is what I am exactly looking for. The author stores the module handle in a persistent object like this:
auto module_handle = Persistent<Object>::New(target);
But this seems to be either deprecated or no longer possible. However, I figured out that it can be achieved like this:
auto module_handle = Persistent<Object>(context->GetIsolate(), target);
However, when I try to access the properties of the latter, I mostly find private methods and properties, nothing worth using, or perhaps I just don't know how to use it.
My question is: is there an up-to-date guide on how to properly handle this kind of thing when writing a Node module? Or can you show me an example of how I can pass my latter module_handle to my thread and use it, for example, to execute a JS function called test?
I also want to know, what is the difference between NODE_MODULE and NODE_MODULE_CONTEXT_AWARE when initializing a node module?

Why we have to export the function used by spawn?

In Erlang, when dealing with processes, you have to export the function passed to the spawn function.
-module(echo).
-export([start/0, loop/0]).
start() ->
    spawn(echo, loop, []).
The reason from the book "Programming Erlang, 2nd Edition. page 188" is
"Note that we also have to export the argument of spawn from the module. This is a good practice because we will be able to change the internal details of the server without changing the client code.".
And in the book "Erlang Programming", page 121:
-module(frequency).
-export([start/0, stop/0, allocate/0, deallocate/1]).
-export([init/0]).
%% These are the start functions used to create and
%% initialize the server.
start() ->
    register(frequency, spawn(frequency, init, [])).
init() ->
    Frequencies = {get_frequencies(), []},
    loop(Frequencies).
Remember that when spawning a process, you have to export the init/0 function as it is used by the spawn/3 BIF. We have put this function in a separate export clause to distinguish it from the client functions, which are supposed to be called from other modules.
Would you please explain to me the logic behind that reason?
Short answer: spawn is not a language construct; it is a library function.
This means spawn lives in another module, which has no access to any functions in your module except the exported ones.
You have to give spawn some way to start your code. This can be a function value, i.e. spawn(fun() -> (any code you want, including calls to local functions) end), or a module, exported function name and arguments, which are visible from other modules.
The logic is quite straightforward. Yet confusion can easily arise because:
export does not exactly match object-oriented encapsulation, and in particular public methods;
several common patterns require exporting functions that are not meant to be called by regular clients.
What export really does
Export has a very strict meaning: exported functions are the only functions that can be referred to by their fully qualified name, i.e. by module, function name and arity.
For example:
-module(m).
-export([f/0]).
f() -> foo.
f(_Arg) -> bar.
g() -> foobar.
You can call the first function with an expression such as m:f() but this wouldn't work for the other two functions. m:f(ok) and m:g() will fail with an error.
For this reason, the compiler will warn in the example above that f/1 and g/0 are not called and cannot be called (they are unused).
Functions can always be called from outside a module: functions are values and you can refer to a local function (within a module), and pass this value outside. For example, you can spawn a new process by using a non-exported function, using spawn/1. You could rewrite your example as follows:
start() ->
    spawn(fun loop/0).
This doesn't require exporting loop. Joe Armstrong, in other editions of Programming Erlang, explicitly suggests transforming the code as above to avoid exporting loop/0.
Common patterns requiring an export
Because exports are the only way to refer to a function by name from outside a module, there are two common patterns that require exported functions even if those functions are not part of a public API.
The example you mention arises whenever you want to call a library function that takes an MFA, i.e. a module, a function name and a list of arguments. These library functions will refer to the function by its fully qualified name. In addition to spawn/3, you might encounter timer:apply_after/4.
Likewise, you can write functions that take MFA arguments, and call the function using apply/3.
Sometimes, there are variants of these library functions that directly take a 0-arity function value. This is the case with spawn, as mentioned above. apply/1 doesn't make sense as you would simply write F().
The other common case is behavior callbacks, and especially OTP behaviors. In this case, you will need to export the callback functions which are of course referred to by name.
Good practice is to use separate export attributes for these functions to make it clear these functions are not part of the regular interface of the module.
Exports and code change
There is a third common case for using exports beyond a public API: code changes.
Imagine you are writing a loop (e.g. a server loop). You would typically implement this as follows:
-module(m).
-export([start/0]).
start() -> spawn(fun() -> loop(state) end).
loop(State) ->
    NewState = receive ...
        ...
    end,
    loop(NewState). % not updatable!
This code cannot be updated, as the loop will never exit the module. The proper way would be to export loop/1 and perform a fully qualified call:
-module(m).
-export([start/0]).
-export([loop/1]).
start() -> spawn(fun() -> loop(state) end).
loop(State) ->
    NewState = receive ...
        ...
    end,
    ?MODULE:loop(NewState).
Indeed, when you refer to an exported function using its fully qualified name, the lookup is always performed against the latest version of the module. So this trick allows the process to jump to the newer version of the code at every iteration of the loop. Code updates are actually quite complex, and OTP, with its behaviors, handles them for you. It typically uses the same construct.
Conversely, when you call a function passed as a value, this is always from the version of the module that created this value. Joe Armstrong argues this is an advantage of spawn/3 over spawn/1 in a dedicated section of his book (8.10, Spawning with MFAs). He writes:
Most programs we write use spawn(Fun) to create a new process. This is fine provided we don’t want to dynamically upgrade our code. Sometimes we want to write code that can be upgraded as we run it. If we want to make sure that our code can be dynamically upgraded, then we have to use a different form of spawn.
This is far-fetched as when you spawn a new process, it starts immediately, and an update is unlikely to occur between the start of the new process and the moment the function value is created. Besides, Armstrong's statement is partly untrue: to make sure the code can dynamically be upgraded, spawn/1 will work as well (cf example above), the trick is not to use spawn/3, but to perform a fully qualified call (Joe Armstrong describes this in another section). spawn/3 has other advantages over spawn/1.
Still, the difference between passing a function by value and by name explains why there is no version of timer:apply_after/4 that takes a function by value: there is a delay, and the function value might be old when the timer fires. Such a variant would actually be dangerous because the runtime keeps at most two versions of a module: the current one and the old one. If you reload a module more than once, processes trying to call even older versions of the code will be killed. For this reason, you would often prefer MFAs and their exports to function values.
When you do a spawn, you create a completely new process with its own environment and thread of execution. This means that you are no longer executing "inside" the module where the spawn is called, so you must make an "outside" call into the module. The only functions in a module which can be called from the "outside" are exported functions, hence the spawned function must be exported.
It might seem a little strange seeing you are spawning a function in the same module but this is why.
I think it is important to remember that a module is just code and does not contain any deeper meaning than that, for example like a class in an OO language. So even if you have functions from the same module being executed in different processes, a very common occurrence, then there is no implicit connection between them. You still have to send messages between processes even if it is from/to functions in the same module.
EDIT:
About the last part of your question, with the quote about putting the export of init/0 in a separate export declaration: there is no need to do this and it has no semantic significance. You can use as many or as few export declarations as you wish. So you could put all the functions in one export declaration or have a separate one for each function; it makes no difference.
The reason to split them is purely visual and for documentation purposes. You typically group functions which go together into separate export declarations to make it easier to see that they are a group. You also typically put "internal" exported functions, functions which aren't meant for the user to directly call, in a separate export declaration. In this case init/0 has to be exported for the spawn but is not meant to be called directly outside the spawn.
Having the user call the start/0 function to start the server, rather than explicitly spawn the init/0 function, allows you to change the internals as you wish later on. The user only sees the start/0 function, which is what the first quote is trying to say.
If you're wondering why you have to export anything and not have everything visible by default, it's because it's clearer to the user which functions they should call if you hide all the ones they shouldn't. That way, if you change your mind on the implementation, people using your code won't notice. Otherwise, there may be someone who is using a function that you want to change or eliminate.
For example, say you have a module:
-module(somemod).
useful() ->
    helper().
helper() ->
    i_am_helping.
And you want to change it to:
-module(somemod).
useful() ->
    betterhelper().
betterhelper() ->
    i_am_helping_more.
If people should only be calling useful, you should be able to make this change. However, if everything was exported, people might be depending on helper when they shouldn't be. This change will break their code when it shouldn't.

How does require in node.js deal with globals?

I just found out that if I require a module and store it as a global, I can overwrite methods and properties in the module as shown below:
global.passwordhelper_mock = require("helpers/password")
sinon.stub(passwordhelper_mock, "checkPassword").returns true
If I then require another module which in itself utilizes the above stubbed method, my stubbed version will be used.
How does the require function in node.js take notice to these globals? Why does it only work when I overwrite/stub a module that has been saved as a global?
Thanks
How does the require function in node.js take notice to these globals?
Somewhere inside the module there must be a call to module.exports.someObject = function(x) {...} in order for someObject to become available globally.
Why does it only work when I overwrite/stub a module that has been saved as a global?
Not sure I follow here. If the object was hidden then you couldn't overwrite it. You can overwrite any object available to you, either a global object (e.g. console) or a property of any object available to you at runtime (e.g. console.log).

GDB break on object function call

I'm debugging an issue, and I want to break on every method call that has a specific object as the 'this' parameter. Is this possible in GDB?
It's easy. You can use a command like b A::a if (this==0x28ff1e).
The this parameter is only passed to methods of the class itself, so you should just need to set breakpoints on all of the methods of the class you are looking at. I'm not sure there is a simple way to do that, though.
I want to break on every method call that has a specific object as the 'this' parameter
This means that you want to break on every member function of a particular class for which the object has been instantiated.
Let's say for convenience that all the member functions are defined in a particular cpp file such as myclass_implementation.cpp
You can use gdb to set a breakpoint on every function inside myclass_implementation.cpp this way:
rbreak myclass_implementation.cpp:.
If you only want to break on specific functions, such as getters whose names start with Get, you can pass a pattern (rbreak interprets it as a regular expression, not a shell glob):
rbreak myclass_implementation.cpp:Get*