How does require in node.js deal with globals?

I just found out that if I require a module and store it as a global, I can overwrite methods and properties in the module as shown below:
global.passwordhelper_mock = require("helpers/password")
sinon.stub(passwordhelper_mock, "checkPassword").returns(true)
If I then require another module which in itself utilizes the above stubbed method, my stubbed version will be used.
How does the require function in node.js take notice of these globals? Why does it only work when I overwrite/stub a module that has been saved as a global?
Thanks

How does the require function in node.js take notice of these globals?
Somewhere inside the module there must be an assignment like module.exports.someObject = function(x) {...} in order for someObject to become available to code that requires the module. Just as importantly, require caches modules: every require("helpers/password") in the same process returns the same exports object, so stubbing a method on that object affects every other module that requires it. Storing the reference on global is incidental; the sharing comes from the module cache.
Why does it only work when I overwrite/stub a module that has been saved as a global?
Not sure I follow here. If the object were hidden, then you couldn't overwrite it. You can overwrite any object available to you, either a global object (e.g. console) or a property of any object available to you at runtime (e.g. console.log).
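For illustration, here is a minimal sketch of that caching behavior; the file layout and relative paths are assumptions, not taken from the question:
// helpers/password.js
exports.checkPassword = function (password) {
  return password === "secret";
};

// test.js
var sinon = require("sinon");
var passwordhelper = require("./helpers/password");

sinon.stub(passwordhelper, "checkPassword").returns(true);

// Any other file requiring the same path receives the same cached
// exports object, so it sees the stub too -- no global needed:
var samehelper = require("./helpers/password");
console.log(samehelper === passwordhelper);     // true
console.log(samehelper.checkPassword("wrong")); // true (stubbed)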

Related

Pybind11 Class Definition

What's the difference between the following class definitions in pybind11?
(1)
py::class_<Pet> pet(m, "Pet");
(2)
py::class_<Pet>(m, "Pet")
What's the use of pet in pet(m, "Pet")? I found this definition on page 42, section "5.8 Enumerations and internal types", of the documentation, which can be found here.
The first creates a named variable that you can refer to later within the same scope (as is done in the example that you reference); the second creates an (unnamed) temporary that you can only use by chaining the function calls that set more properties in the same statement. If the variable does not escape the local scope, then the only difference is syntax. Otherwise, by naming it, you could for example pass it along to one or more helper functions (e.g. for factoring out the definitions of common properties).
What is important to understand is that all Python classes, functions, etc. are run-time constructs. I.e. some Python API code needs to be called to create them, for example when the module is loaded. A py::class_ object calls the APIs to create a Python class and to register some type info for internal pybind11 use (e.g. for casting later on). I.e. it is only a recipe to create the requested Python class, it is not that class itself. Once the Python class is created and its type info stored, the recipe object is no longer needed and can be safely destroyed (e.g. by letting it go out of scope).
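For illustration, a minimal sketch of both forms; the add_common_properties helper and the name member are made up for this example:
#include <pybind11/pybind11.h>
#include <string>

namespace py = pybind11;

struct Pet { std::string name; };

// Hypothetical helper: factors out property definitions shared by
// several classes -- only possible with the named form (1).
void add_common_properties(py::class_<Pet> &cls) {
    cls.def_readwrite("name", &Pet::name);
}

PYBIND11_MODULE(example, m) {
    // (1) named variable: can be referred to again and passed around
    py::class_<Pet> pet(m, "Pet");
    pet.def(py::init<>());
    add_common_properties(pet);
    // pet may now go out of scope; the created Python class lives on.

    // (2) unnamed temporary: everything must be chained in one statement
    // py::class_<Pet>(m, "Pet").def(py::init<>());
}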

Get current module inside FunctionPass llvm

I'm writing a function pass in LLVM and need to call the method Module::getOrInsertFunction. I need to access the module of the current function. How do I get it?
You can use the getParent() function: http://llvm.org/docs/doxygen/html/classllvm_1_1GlobalValue.html#a9e1fc23a17e97d2d1732e753ae9251ac
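For illustration, a minimal sketch of how this looks inside runOnFunction; the pass name is made up, and note the restriction discussed in the next answer:
#include "llvm/IR/Function.h"
#include "llvm/IR/Module.h"
#include "llvm/Pass.h"

using namespace llvm;

namespace {
struct MyPass : public FunctionPass {
  static char ID;
  MyPass() : FunctionPass(ID) {}

  bool runOnFunction(Function &F) override {
    // A Function is a GlobalValue; getParent() returns its Module.
    Module *M = F.getParent();
    (void)M; // use M here, subject to the FunctionPass restrictions below
    return false;
  }
};
}

char MyPass::ID = 0;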
Please refer to: http://llvm.org/docs/WritingAnLLVMPass.html
According to the documentation here,
To be explicit, FunctionPass subclasses are not allowed to:
1. Inspect or modify a Function other than the one currently being processed.
2. Add or remove Functions from the current Module.
3. Add or remove global variables from the current Module.
4. Maintain state across invocations of runOnFunction (including global data).
So, you cannot call getOrInsertFunction from inside a FunctionPass; you will need a ModulePass instead.
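For illustration, a minimal sketch of a ModulePass that may legitimately insert a declaration; the function name is made up, and the exact getOrInsertFunction signature varies between LLVM versions:
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Module.h"
#include "llvm/Pass.h"

using namespace llvm;

namespace {
struct MyModulePass : public ModulePass {
  static char ID;
  MyModulePass() : ModulePass(ID) {}

  bool runOnModule(Module &M) override {
    // A ModulePass is allowed to add functions to the module.
    FunctionType *FT =
        FunctionType::get(Type::getVoidTy(M.getContext()), false);
    M.getOrInsertFunction("my_runtime_hook", FT);
    return true;
  }
};
}

char MyModulePass::ID = 0;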

Getting a reference to a Node module and working with it in a separate thread

Assuming I have 2 different sources:
node_module.cc
threaded_class.cc
node_module.cc is where I call NODE_MODULE to initialize my module. This module has a function that creates an instance of the class defined in threaded_class.cc (in a separate thread). I understand that I need to use Lockers and Isolates to access V8 from a separate thread, but my issue is bigger than that.
From my understanding, the NODE_MODULE init function is my only chance to capture the module's instance. I found this article that uses a piece of code that is exactly what I am looking for. The author stores the module handle in a persistent object like this:
auto module_handle = Persistent<Object>::New(target);
But this seems to be either deprecated or no longer possible. However, I figured that it can be achieved like this:
auto module_handle = Persistent<Object>(context->GetIsolate() ,target);
However, when I try to access the latter's properties, they are mostly private methods and properties; nothing looks usable, or I don't know how to use it.
My question is: is there any updated guide on how to properly handle this kind of thing when writing a Node module? Or can you show me an example of how I can pass my latter module_handle to my thread and use it, for example, to execute a js function called test?
I also want to know: what is the difference between NODE_MODULE and NODE_MODULE_CONTEXT_AWARE when initializing a node module?

Why do we have to export the function used by spawn?

In Erlang, when dealing with processes, you have to export the function used in the spawn function.
-module(echo).
-export([start/0, loop/0]).

start() ->
    spawn(echo, loop, []).
The reason given in the book "Programming Erlang, 2nd Edition", page 188, is:
"Note that we also have to export the argument of spawn from the module. This is a good practice because we will be able to change the internal details of the server without changing the client code.".
And in the book "Erlang Programming", page 121:
-module(frequency).
-export([start/0, stop/0, allocate/0, deallocate/1]).
-export([init/0]).

%% These are the start functions used to create and
%% initialize the server.
start() ->
    register(frequency, spawn(frequency, init, [])).

init() ->
    Frequencies = {get_frequencies(), []},
    loop(Frequencies).
Remember that when spawning a process, you have to export the init/0 function as it is used by the spawn/3 BIF. We have put this function in a separate export clause to distinguish it from the client functions, which are supposed to be called from other modules.
Would you please explain to me the logic behind that reason?
The short answer is: spawn is not a language construct, it's a library function.
That means spawn lives in another module, which has no access to any functions in your module except the exported ones.
You have to give the spawn function some way to start your code. It can be a function value (i.e. spawn(fun() -> (any code you want, including local function invocations) end)) or a module/exported-function-name/arguments triple, which is visible from other modules.
The logic is quite straightforward. Yet confusion can easily arise because:
export does not exactly match object-oriented encapsulation, and especially public methods;
several common patterns require exporting functions that are not meant to be called by regular clients.
What export really does
Export has a very strict meaning: exported functions are the only functions that can be referred to by their fully qualified name, i.e. by module, function name and arity.
For example:
-module(m).
-export([f/0]).
f() -> foo.
f(_Arg) -> bar.
g() -> foobar.
You can call the first function with an expression such as m:f() but this wouldn't work for the other two functions. m:f(ok) and m:g() will fail with an error.
For this reason, the compiler will warn in the example above that f/1 and g/0 are not called and cannot be called (they are unused).
Function values, on the other hand, can always be called from outside a module: functions are values, so you can take a reference to a local function (within its module) and pass this value outside. For example, you can spawn a new process using a non-exported function via spawn/1. You could rewrite your example as follows:
start() ->
    spawn(fun loop/0).
This doesn't require exporting loop/0. Joe Armstrong, in other editions of Programming Erlang, explicitly suggests transforming the code as above to avoid exporting loop/0.
Common patterns requiring an export
Because exports are the only way to refer to a function by name from outside a module, there are two common patterns that require exported functions even if those functions are not part of a public API.
The case you mention arises whenever you want to call a library function that takes an MFA, i.e. a module, a function name and a list of arguments. These library functions refer to the function by its fully qualified name. In addition to spawn/3, you might encounter timer:apply_after/4.
Likewise, you can write functions that take MFA arguments, and call the function using apply/3.
Sometimes, there are variants of these library functions that directly take a 0-arity function value. This is the case with spawn, as mentioned above. apply/1 doesn't make sense as you would simply write F().
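For illustration, a minimal sketch of such MFA-based calls; the module and function names are made up:
-module(mfa_demo).
-export([start/0, tick/1]).

%% tick/1 must be exported: both timer:apply_after/4 and apply/3
%% refer to it from outside this module by module, name and arity.
start() ->
    timer:apply_after(1000, mfa_demo, tick, [self()]),
    apply(mfa_demo, tick, [self()]).

tick(Pid) ->
    Pid ! tick.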
The other common case is behavior callbacks, and especially OTP behaviors. In this case, you will need to export the callback functions which are of course referred to by name.
Good practice is to use separate export attributes for these functions to make it clear these functions are not part of the regular interface of the module.
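For illustration, a sketch of this convention with a minimal (made-up) OTP behavior module:
-module(my_server).
-behaviour(gen_server).

%% Client API.
-export([start_link/0]).

%% gen_server callbacks: exported in a separate attribute because the
%% gen_server machinery calls them by name, not regular clients.
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

init([]) ->
    {ok, #{}}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.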
Exports and code change
There is a third common case for using exports beyond a public API: code changes.
Imagine you are writing a loop (e.g. a server loop). You would typically implement it as follows:
-module(m).
-export([start/0]).

start() -> spawn(fun() -> loop(state) end).

loop(State) ->
    NewState = receive ...
                   ...
               end,
    loop(NewState). % not updatable!
This code cannot be updated: the loop only performs local calls, so it never switches to a newer version of the module. The proper way is to export loop/1 and perform a fully qualified call:
-module(m).
-export([start/0]).
-export([loop/1]).

start() -> spawn(fun() -> loop(state) end).

loop(State) ->
    NewState = receive ...
                   ...
               end,
    ?MODULE:loop(NewState).
Indeed, when you refer to an exported function by its fully qualified name, the lookup is always performed against the latest version of the module. So this trick allows the loop to jump to the newer version of the code at every iteration. Code updates are actually quite complex, and OTP, with its behaviors, does it right for you. It typically uses the same construct.
Conversely, when you call a function passed as a value, the call always runs in the version of the module that created this value. Joe Armstrong argues that this is an advantage of spawn/3 over spawn/1 in a dedicated section of his book (8.10, Spawning with MFAs). He writes:
Most programs we write use spawn(Fun) to create a new process. This is fine provided we don’t want to dynamically upgrade our code. Sometimes we want to write code that can be upgraded as we run it. If we want to make sure that our code can be dynamically upgraded, then we have to use a different form of spawn.
This is far-fetched, as a spawned process starts immediately and an update is unlikely to occur between the moment the function value is created and the start of the new process. Besides, Armstrong's statement is partly untrue: to make sure the code can be dynamically upgraded, spawn/1 works as well (cf. the example above); the trick is not to use spawn/3 but to perform a fully qualified call (Joe Armstrong describes this in another section). spawn/3 has other advantages over spawn/1.
Still, the difference between passing a function by value and by name explains why there is no version of timer:apply_after/4 that takes a function by value: since there is a delay, the function value might be old when the timer fires. Such a variant would actually be dangerous, because at most two versions of a module can be loaded at the same time: the current one and the old one. If you reload a module more than once, processes still trying to run even older versions of the code are killed. For this reason, you will often prefer MFAs and their exports over function values.
When you do a spawn, you create a completely new process with its own environment and thread of execution. This means that you are no longer executing "inside" the module where the spawn is called, so you must make an "outside" call into the module. The only functions in a module which can be called from the "outside" are exported functions, hence the spawned function must be exported.
It might seem a little strange seeing as you are spawning a function in the same module, but this is why.
I think it is important to remember that a module is just code and does not carry any deeper meaning than that, unlike, for example, a class in an OO language. So even if you have functions from the same module being executed in different processes, a very common occurrence, there is no implicit connection between them. You still have to send messages between processes even if it is from/to functions in the same module.
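For illustration, a minimal (made-up) sketch: two processes running code from the same module still communicate only via messages:
-module(same_mod).
-export([start/0, loop/0]).

start() ->
    Pid = spawn(same_mod, loop, []), % loop/0 must be exported for spawn/3
    Pid ! {self(), hello},
    receive
        {Pid, Reply} -> Reply
    end.

loop() ->
    receive
        {From, hello} -> From ! {self(), world}
    end.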
EDIT:
About the last part of your question, with the quote about putting the export of init/0 in a separate export declaration: there is no need to do this and it has no semantic significance; you can use as many or as few export declarations as you wish. So you could put all the functions in one export declaration or have a separate one for each function; it makes no difference.
The reason to split them is purely visual and for documentation purposes. You typically group functions which belong together into separate export declarations to make it easier to see that they form a group. You also typically put "internal" exported functions, functions which aren't meant for the user to call directly, in a separate export declaration. In this case init/0 has to be exported for the spawn but is not meant to be called directly outside the spawn.
Having the user call the start/0 function to start the server, rather than explicitly spawning the init/0 function, allows you to change the internals as you wish later on. The user only sees the start/0 function. This is what the first quote is trying to say.
If you're wondering why you have to export anything and not have everything visible by default, it's because it's clearer to the user which functions they should call if you hide all the ones they shouldn't. That way, if you change your mind on the implementation, people using your code won't notice. Otherwise, there may be someone who is using a function that you want to change or eliminate.
For example, say you have a module:
-module(somemod).
-export([useful/0]).

useful() ->
    helper().

helper() ->
    i_am_helping.
And you want to change it to:
-module(somemod).
-export([useful/0]).

useful() ->
    betterhelper().

betterhelper() ->
    i_am_helping_more.
If people should only be calling useful, you should be able to make this change. However, if everything were exported, people might be depending on helper when they shouldn't be, and this change would break their code when it shouldn't.

How to automatically inject helper classes in each new module?

Developing a modular application, I want to inject some helper classes into each module. This should happen automatically. Note that my helpers have state, so I can't just make them static and include them where needed.
I could store all helpers in a map with a string key and make it available to the abstract base class all modules inherit from.
std::unordered_map<std::string, void*> helpers;
RendererModule* renderer = new RendererModule(helpers); // argument is passed to
                                                        // base class constructor
Then inside a module, I could access helpers like this.
std::string file = ((FileHelper*)helpers["file"])->Read("C:/file.txt");
But instead, I would like to access the helpers like this.
std::string file = File->Read("C:/file.txt");
To do so, at the moment I separately define members for all helpers in the module base class, and set them for each specific module.
FileHelper* file = new FileHelper(); // some helper instances are passed to
                                     // multiple modules, while others are
                                     // newly created for each one
RendererModule* renderer = new RendererModule();
renderer->File = file;
Is there a way to automate this, so that I don't have to change the module code when adding a new helper to the application, while keeping the second syntax? I am not that familiar with C macros, so I don't know whether they are capable of that.
I think I see what your dilemma is, but I have no good solution for it. However, since there are no other answers, I will contribute my two cents.
I use the combination of a few strategies to help me with these kinds of problems:
If the helper instance is truly module-specific, I let the module itself create and manage it inside.
If I don't want the module to know about the creation or destruction of the helper(s), or if the lifetime of the helper instance is not tied to the module that is using it, or if I want to share a helper instance among several modules, I create it outside and pass the reference to the entry-point constructor of the module. Passing it to the constructor has the advantage of making the dependency explicit.
If the number of helpers is high (say, more than 2-3), I create an encompassing struct (or simple class) that just contains all the pointers and pass that struct into the constructor of the module or subsystem. For example:
struct Platform { // I sometimes call it "Environment", etc.
    FileHelper * file;
    LogHelper * log;
    MemoryHelper * mem;
    StatsHelper * stats;
};
Note: this is not a particularly nice or safe solution, but it's no worse than managing disparate pointers and it is straightforward.
All the above assumes that helpers have no dependency on modules (i.e. they live on a lower abstraction level and know nothing about modules). If some helpers are closer to modules, that is, if you start wanting to inject module-to-module dependencies into each other, the above strategies really break down.
In these cases (which obviously happen a lot) I have found that a centralized ModuleManager singleton (probably a global object) is the best. You explicitly register your modules into it, along with explicit order of initialization, and it constructs all the modules. The modules can ask this ModuleManager for a reference to other modules by name (kind of like a map of strings to module pointers,) but they do this once and store the pointers internally in any way they want for convenient and fast access.
However, to prevent messy lifetime and order-of-destruction issues, any time a module is constructed or destructed, the ModuleManager notifies all other modules via callbacks, so they have the chance to update their internal pointers to avoid dangling pointers and other problems.
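For illustration, a very rough sketch of that idea; all names are hypothetical and lifetime handling is reduced to the bare minimum:
#include <string>
#include <unordered_map>

class Module {
public:
    virtual ~Module() = default;
    // Callback so surviving modules can drop pointers to a dying module.
    virtual void onModuleDestroyed(const std::string& /*name*/) {}
};

class ModuleManager {
public:
    static ModuleManager& instance() {
        static ModuleManager mgr; // global singleton, as described above
        return mgr;
    }

    void registerModule(const std::string& name, Module* module) {
        modules_[name] = module;
    }

    Module* get(const std::string& name) const {
        auto it = modules_.find(name);
        return it == modules_.end() ? nullptr : it->second;
    }

    void unregisterModule(const std::string& name) {
        modules_.erase(name);
        for (auto& entry : modules_) // notify the survivors
            entry.second->onModuleDestroyed(name);
    }

private:
    std::unordered_map<std::string, Module*> modules_;
};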
That's it. By the way, you might want to investigate articles and implementations related to the "service locator" pattern.