Proper practice for setup upon load in R package development

Proper practice for setup upon load in R package development - c++

What is the correct way to go about automatically running some setup code (either in R or C++) once per package loading? Ideally, said code would execute once the user did library(mypackage). Right now, it's contained in a setup() function that needs to be run once before anything else.
Just for more context, in my specific case, I'm using an external library that uses glog and I need to execute google::InitGoogleLogging() once and only once. It's slightly awkward because I'm trying to use it within a library because I have to, even though it's supposed to be called from a main.

Just read 'Writing R Extensions' and follow the leads -- it is either .onAttach() or .onLoad(). I have lots of packages that do little things there -- and it doesn't matter this calls to C++ (via Rcpp or not) as you are simply asking about where to initialise things.
Example: Rblpapi creates a connection and stores it
.pkgenv <- new.env(parent=emptyenv())
.onAttach <- function(libname, pkgname) {
if (getOption("blpAutoConnect", FALSE)) {
con <- blpConnect()
if (getOption("blpVerbose", FALSE)) {
packageStartupMessage(paste0("Created and stored default connection object ",
"for Rblpapi version ",
packageDescription("Rblpapi")$Version, "."))
}
} else {
con <- NULL
}
assign("con", con, envir=.pkgenv)
}
I had some (not public) code that set up a handle (using C++ code) to a proprietary database the same way. The key is that these hooks guarantee you execution on package load / attach which is what you want here.

Related

c++ best way to realise global switches/flags to control program behaviour without tying the classes to a common point

Let me elaborate on the title:
I want to implement a system that would allow me to enable/disable/modify the general behavior of my program. Here are some examples:
I could switch off and on logging
I could change if my graphing program should use floating or pixel coordinates
I could change if my calculations should be based upon some method or some other method
I could enable/disable certain aspects like maybe a extension api
I could enable/disable some basic integrated profiler (if I had one)
These are some made-up examples.
Now I want to know what the most common solution for this sort of thing is.
I could imagine this working with some sort of singelton class that gets instanced globally or in some other globally available object. Another thing that would be possible would be just constexpr or other variables floating around in a namespace, again globally.
However doing something like that, globally, feels like bad practise.
second part of the question
This might sound like I cant decide what I want, but I want a way to modify all these switches/flags or whatever they are actually called in a single location, without tying any of my classes to it. I don't know if this is possible however.
Why don't I want to do that? Well I like to make my classes somewhat reusable and I don't like tying classes together, unless its required by the DRY principle and or inheritance. I basically couldn't get rid of the flags without modifying the possible hundreds of classes that used them.
What I have tried in the past
Having it all as compiler defines. This worked reasonably well, however I didnt like that I couldnt make it so if the flag file was gone there were some sort of default settings that would make the classes themselves still operational and changeable (through these default values)
Having it as a class and instancing it globally (system class). Worked ok, however I didnt like instancing anything globally. Also same problem as above
Instancing the system class locally and passing it to the classes on construction. This was kinda cool, since I could make multiple instruction sets. However at the same time that kinda ruined the point since it would lead to things that needed to have one flag set the same to have them set differently and therefore failing to properly work together. Also passing it on every construction was a pain.
A static class. This one worked ok for the longest time, however there is still the problem when there are missing dependencies.
Summary
Basically I am looking for a way to have a single "place" where I can mess with some values (bools, floats etc.) and that will change the behaviour of all classes using them for whatever, where said values either overwrite default values or get replaced by default values if said "place" isnt defined.

If a Singleton class does not work for you , maybe using a DI container may fit in your third approach? It may help with the construction and make the code more testable.
There are some DI frameworks for c++, like https://github.com/google/fruit/wiki or https://github.com/boost-experimental/di which you can use.

If you decide to use switch/flags, pay attention for "cyclometric complexity".
If you do not change the skeleton of your algorithm but only his behaviour according to the objets in parameter, have a look at "template design pattern". This method allow you to define a generic algorithm and specify particular step for a particular situation.

Here's an approach I found useful; I don't know if it's what you're looking for, but maybe it will give you some ideas.
First, I created a BehaviorFlags.h file that declares the following function:
// Returns true iff the given feature/behavior flag was specified for us to use
bool IsBehaviorFlagEnabled(const char * flagName);
The idea being that any code in any of your classes could call this function to find out if a particular behavior should be enabled or not. For example, you might put this code at the top of your ExtensionsAPI.cpp file:
#include "BehaviorFlags.h"
static const enableExtensionAPI = IsBehaviorFlagEnabled("enable_extensions_api");
[...]
void DoTheExtensionsAPIStuff()
{
if (enableExtensionsAPI == false) return;
[... otherwise do the extensions API stuff ...]
}
Note that the IsBehaviorFlagEnabled() call is only executed once at program startup, for best run-time efficiency; but you also have the option of calling IsBehaviorFlagEnabled() on every call to DoTheExtensionsAPIStuff(), if run-time efficiency is less important that being able to change your program's behavior without having to restart your program.
As far as how the IsBehaviorFlagEnabled() function itself is implemented, it looks something like this (simplified version for demonstration purposes):
bool IsBehaviorFlagEnabled(const char * fileName)
{
// Note: a real implementation would find the user's home directory
// using the proper API and not just rely on ~ to expand to the home-dir path
std::string filePath = "~/MyProgram_Settings/";
filePath += fileName;
FILE * fpIn = fopen(filePath.c_str(), "r"); // i.e. does the file exist?
bool ret = (fpIn != NULL);
fclose(fpIn);
return ret;
}
The idea being that if you want to change your program's behavior, you can do so by creating a file (or folder) in the ~/MyProgram_Settings directory with the appropriate name. E.g. if you want to enable your Extensions API, you could just do a
touch ~/MyProgram_Settings/enable_extensions_api
... and then re-start your program, and now IsBehaviorFlagEnabled("enable_extensions_api") returns true and so your Extensions API is enabled.
The benefits I see of doing it this way (as opposed to parsing a .ini file at startup or something like that) are:
There's no need to modify any "central header file" or "registry file" every time you add a new behavior-flag.
You don't have to put a ParseINIFile() function at the top of main() in order for your flags-functionality to work correctly.
You don't have to use a text editor or memorize a .ini syntax to change the program's behavior
In a pinch (e.g. no shell access) you can create/remove settings simply using the "New Folder" and "Delete" functionality of the desktop's window manager.
The settings are persistent across runs of the program (i.e. no need to specify the same command line arguments every time)
The settings are persistent across reboots of the computer
The flags can be easily modified by a script (via e.g. touch ~/MyProgram_Settings/blah or rm -f ~/MyProgram_Settings/blah) -- much easier than getting a shell script to correctly modify a .ini file
If you have code in multiple different .cpp files that needs to be controlled by the same flag-file, you can just call IsBehaviorFlagEnabled("that_file") from each of them; no need to have every call site refer to the same global boolean variable if you don't want them to.
Extra credit: If you're using a bug-tracker and therefore have bug/feature ticket numbers assigned to various issues, you can creep the elegance a little bit further by also adding a class like this one:
/** This class encapsulates a feature that can be selectively disabled/enabled by putting an
* "enable_behavior_xxxx" or "disable_behavior_xxxx" file into the ~/MyProgram_Settings folder.
*/
class ConditionalBehavior
{
public:
/** Constructor.
* #param bugNumber Bug-Tracker ID number associated with this bug/feature.
* #param defaultState If true, this beheavior will be enabled by default (i.e. if no corresponding
* file exists in ~/MyProgram_Settings). If false, it will be disabled by default.
* #param switchAtVersion If specified, this feature's default-enabled state will be inverted if
* GetMyProgramVersion() returns any version number greater than this.
*/
ConditionalBehavior(int bugNumber, bool defaultState, int switchAtVersion = -1)
{
if ((switchAtVersion >= 0)&&(GetMyProgramVersion() >= switchAtVersion)) _enabled = !_enabled;
std::string fn = defaultState ? "disable" : "enable";
fn += "_behavior_";
fn += to_string(bugNumber);
if ((IsBehaviorFlagEnabled(fn))
||(IsBehaviorFlagEnabled("enable_everything")))
{
_enabled = !_enabled;
printf("Note: %s Behavior #%i\n", _enabled?"Enabling":"Disabling", bugNumber);
}
}
/** Returns true iff this feature should be enabled. */
bool IsEnabled() const {return _enabled;}
private:
bool _enabled;
};
Then, in your ExtensionsAPI.cpp file, you might have something like this:
// Extensions API feature is tracker #4321; disabled by default for now
// but you can try it out via "touch ~/MyProgram_Settings/enable_feature_4321"
static const ConditionalBehavior _feature4321(4321, false);
// Also tracker #4222 is now enabled-by-default, but you can disable
// it manually via "touch ~/MyProgram_Settings/disable_feature_4222"
static const ConditionalBehavior _feature4222(4222, true);
[...]
void DoTheExtensionsAPIStuff()
{
if (_feature4321.IsEnabled() == false) return;
[... otherwise do the extensions API stuff ...]
}
... or if you know that you are planning to make your Extensions API enabled-by-default starting with version 4500 of your program, you can set it so that Extensions API will be enabled-by-default only if GetMyProgramVersion() returns 4500 or greater:
static ConditionalBehavior _feature4321(4321, false, 4500);
[...]
... also, if you wanted to get more elaborate, the API could be extended so that IsBehaviorFlagEnabled() can optionally return a string to the caller containing the contents of the file it found (if any), so that you could do shell commands like:
echo "opengl" > ~/MyProgram_Settings/graphics_renderer
... to tell your program to use OpenGL for its 3D graphics, or etc:
// In Renderer.cpp
std::string rendererType;
if (IsDebugFlagEnabled("graphics_renderer", &rendererType))
{
printf("The user wants me to use [%s] for rendering 3D graphics!\n", rendererType.c_str());
}
else printf("The user didn't specify what renderer to use.\n");

How can I debug a C++ DLL function, called from VBA, using Visual Studio

I have written a DLL function in C++, which I am calling from VBA (Excel).
How can I setup the Visual Studio properties to allow me to debug the function? I have tried specifying Excel, but that doesn’t seem to work.

You have two choices: "direct debug", or "attach".
I strongly prefer the "direct debug" approach for a long list of reasons omitted from here.
There are steps required on both the DLL and Excel/VBA sides, your posting is unclear if all of those steps are addressed.
There are variations on the following:
1) In VS, depending on the version, enter Project Settings, or Project Properties, or equivalent, in the "Debug (not release) Target", go to the Debug or Debugging settings. :
a) There will be an field called "Executable for debugging session", or "command", or something like that depending on VS ver. Here, enter the full path of your Excel exe
b) Optionally, if the same "test spread sheet" is used frequently, enter the full path of your xls (or whatever) in the field called "Command argument", or "program argument" or as in your VS ver.
You may need to surround this with double quotes (e.g. if there are spaces in your path/file names).
c) You can also set the output of your project to a Dir that is "addin helpful", such as a Dir called AddIn (c.f. having the DLL end up in Debug (or Release) Dirs)
... it is assumed that your DLL has all the bits required to export the functions, with the project being of type DLL, plus any DLLEXPORT and compiler directives, etc etc.
... the specifics of the DLLEXPORT settings (and related compiler switches) and Calling Convention will determine many things ... it is assumed you have done all that correctly and consistently (and especially consistently with what the Excel-side is expecting).
... your DLL may or may not have a DLL_Main, if it does, more discussion is required.
2) Before anything else, be sure to have created the Excel-side "interface" for your DLL, ie. the "Add-In". This can be either via .xla, or via .xll. I strongly suggest the .xla route as your first approach.
See the Excel help files etc for creating the .xla
Then, in your XLA's VBA Module(s), declare the functions/subs etc from your DLL. For example, if you have a DLL called Add2_DLL.dll, which contains an exported function "add2_pho_xl", then:
Public Declare Function Add2_Pho_XX Lib "E:\EclipseWorkSpace\Add2_DLL\Debug\Add2_DLL.dll" _
Alias "add2_pho_xl" (A As Double, B As Double) As Double
I have used the Alias approach here, for reasons required below.
In many instances, this declaration can be used directly as User Defined Function (UDF) in your sheets, etc. However, for a vast number of cases, you will need to create a VBA "spinner" function/sub that creates the "ultimate" UDF, and relies on this direct entry function (see below). This is also a very long story, but necessary where more complex matters are required (arrays, variants, etc etc).
NOTICE:
a) the DLL's full path is required unless special steps have been taken. If your Addin is for general distribution ... a much longer discussion is required.
b) the Alias must be the EXACT entry name of the function in your DLL. If you view near the end of the DLL (or .Def) files, and unless you set your DLL modules as Private, those will show the entry names expected on the DLL side.
In this example, the entry name is NOT "decorated" due to the choices in the Calling Convention and compiler switches, but it could look something like
"_add2_pho_xl_#08" etc depending your choices.
... in any case, you must use whatever the DLL has.
3) Once both the .xla and dll exist (it is best if they are in the same Dir), Excel must be "told" about the Add-In. The easiest approach is Excel/Tools/Addins, but there are various strategies for "registering" DLL functions.
4) CRUCIALLY, the argument list properties/declarations MUST BE CONGRUENT with BOTH those in your DLL and the Calling Convention. Three (of many possible) examples of "issues" are,
(i) A Boolean on the VBA-side is two-bytes, if the Bool/Logical on your DLL side is 1-byte, then the Debug will fail, since the two sides "cannot connect" properly.
(ii) Strings ... this can be a very long story. It depends if sent ByVal or ByRef, it depends if the DLL side has "hidden length" Args, etc. etc. Remember, VBA uses COM/OLE VBStrings, which have their own can of worms.
(iii) Variants/Objects: these require a tome onto themselves
5) If ALL (and likely more) of the above have gone well, then in VS set your break points, if required, and "Go" or "Start" the debug (depending on VS ver, etc.). This should launch Excel, and if you also set the target xls, it will launch too. Go to the cell(s) where you addin function (e.g. =add2_pho_XX(A1, B1) ) resides, and execute the cell (sometimes using the "fx" menu item is useful, but there are other ways also).
Some hints:
a) if the func execution crashes/hangs etc Excel and does not even arrive back to the VS side, then likely there is a Arg list conflict (e.g. you are passing a Double to an Int or a million other possibilities), or Calling Convention conflict, etc.
b) In general, you may (while in the VS debug session) simultaneously perform a VBA debug session. That is, after starting the VS bebug, entre the VBA IDE, and set break points in VBA UDF's, if a "spinner" UDF's have been created. For example, a VBA UDF that relies also on the DLL's function.
Private Function Add2_Pho( FirstNum as Double, SecondNum as Double, Optional lMakeRed as Variant) As Variant
'
'
Add2_Pho = Add2_Pho_XX( FirstNum, SecondNum ) ' this is the actual DLL func accessed via Delcare above
'
If( IsMissing(lMakeRed) ) Then
Else
If( cBool(lMakeRed) ) Then
If( Add2_Pho < 0 ) Then
'
' ' make the result Red, or something
'
End If
End If
End If
'
'
End Function
... here, setting a break point at the first line can be helpful to see if the UDF is even entered on the VBA side. If it is, click Continue in VBA, and see if it makes it to the VS side, if not, check Args, Calling Convention, etc again, etc etc
c) If the cell's content are #Value or some other unexpected result, then at least the UDF is "recognised" but not functioning correctly, either due to sheet->VBA issues, or VBA-> DLL issues, or after the return DLL-> VBA
d) Save often! and Use the VBA IDE's Debug/ Compile VBA Project before running anything just make sure VBA internal consistency.
e) Also, if you are using VBA/XLA's, then get a copy of CleanProject (VBA can mess up its internals sometimes, and this Tool will be a life saver)

Please make sure that Debug mode is the active mode.
How to debug your DLL with Excel/VBA

Tcl Extension Calling a VB.NET DLL

I have a need to create a Tcl extension that calls a managed .NET DLL/Class Library. Currently, the structure of my application is Tcl > DLL Wrapper (C++ CLR) > .NET Class Library (VB.NET), where ">" represents a function call.
My VB.NET DLL just takes a value and returns it back, keeping it simple for now. In the end, this will do some more advanced stuff that makes use of some .NET functionality.
Public Class TestClass
Public Function TestFunction(ByVal param As Integer) As Integer
Return param
End Function
End Class
My Tcl Extension (C++ CLR) creates an object of the type above
int TestCmd(ClientData data, Tcl_Interp *interp, int objc, Tcl_Obj *CONST objv[])
{
// Check the number of arguments
if (objc != 2) {
Tcl_WrongNumArgs(interp, 0, objv, "arg");
return TCL_ERROR;
}
int param, result;
if (Tcl_GetIntFromObj(interp, objv[1], &param) != TCL_OK)
return TCL_ERROR;
SimpleLibrary::TestClass^ myclass = gcnew SimpleLibrary::TestClass(); //System.IO.FileNotFoundException
result = myclass->TestFunction(param);
Tcl_SetObjResult(interp, Tcl_NewIntObj(result));
return TCL_OK;
}
And finally, my Tcl script loads the extension and calls the function.
load SimpleTclExtension.dll
TestCmd 2
If my VB.NET DLL is in the same directory as my extension DLL, the extension crashes when it instantiates a TestClass object. I've noticed if the VB.NET DLL is relocated to C:\Tcl\bin, the extension will find it, and TestCmd can be called just fine. The problem is that this will eventually need to be deployed across a number of PCs, and it's preferred not to mingle my application's files with another application's.
It seems like there should be some configuration settings that will fix this problem, but I'm not sure where. Any help is greatly appreciated.

Firstly, depending on just what kind of Tcl application you are using you may want to look at Eagle which is a implementation of Tcl in CLR.
I think you are bumping into .Net's desire to only load assemblies from your application's directory or its immediate subdirectories. The application here is the tclsh/wish executable which is why moving the .Net assembly makes it load. This is something you can fix with suitable manifests or calls to the API to permit assembly loading from alternate locations. In this case I think you will need to run some initialization code in your Tcl extension when it gets loaded into the Tcl interpreter to init the CLR and add the extensions location as a suitable place to load assemblies from. It has been a while since I was looking at this so I forgot the details but I think you want to look at the AppDomain object and check the assembly loading path properties associated with that or its child objects. Try AppDomain.RelativeSearchPath

To be more specific, Eagle includes Garuda which is a Tcl extension built specifically to allow calling .Net from Tcl

In R, is there a way to know the path to the source file where a given function is defined?

I'm writing a unit test in R, which needs to read some test data defined in the same directory. But I would also like to be able to run that unit test no matter what the current working directory happens to be.
Is there a way to tell R to load a file from here, where here is defined as the directory holding the source file of the function being executed?

Depends on how what you mean with "I'm writing a unit test". If you just source that function from wherever, David is right and I don't even see the need to do that as you know which directory it is.
I would include that function in a package, and then there are mechanisms in R allowing you to make the data available for loading or via lazy-loading. See section 1.1.5 (Data in packages) in the manual Writing R Extensions. This is the R-way of doing it.
Another option Gabor Grothendieck gave in this thread on the R mailing list, is to add following line at the top of a script :
this.dir <- dirname(parent.frame(2)$ofile)
This will give the directory of the file when sourced using source(). Gabor calls it a dirty hack, and I agree with him.
On a sidenote, check also following packages for unit testing in R :
RUnit
svUnit
testthat

If the functions aren't in a package, and sourced from files via source() then perhaps source references might provide something to work with. Argument keep.source = TRUE is required, and read the R Journal article by Duncan Murdoch
Here is a quick example:
> setwd("./Downloads/")
> source("../foo.R", keep.source=TRUE) ## if options("keep.source") is FALSE
> bar
function(a, b) {
a + b
}
> body(bar)
{
a + b
}
attr(,"srcfile")
../foo.R
attr(,"wholeSrcref")
bar <- function(a, b) {
a + b
}
> srcref <- attr(body(bar), "srcref")[[1]]
> attr(srcref, "srcfile")
../foo.R
> ls(attr(srcref, "srcfile"))
[1] "Enc" "encoding" "filename" "timestamp" "wd"
> attr(srcref, "srcfile")$filename
[1] "../foo.R"
> attr(srcref, "srcfile")$wd
[1] "/home/gavin/Downloads"
Of course this assumes you don't know where the sourced functions come from to be of any use and yet the functions need to be sourced...
If this is in a package, then you can have data in a ./data directory or arbitrary directories in ./inst/. You can use data() toload datasets from the former, and system.file() for any file in a package. See the relevant help pages.

There's no realistic hope of achieving this in full generality. Functions don't need to be loaded from files, they can be created dynamically by code, loaded from workspaces and so on.

embedded Lua loading modules programmatically (C++)

I am embedding Lua in a C++ application.
I have some modules (for now, simple .lua scripts) that I want to load programmatically, as the engine is being started, so that when the engine starts, the module(s) is/are available to scripts without them having to include a require 'xxx' at the top of the script.
In order to do this, I need to be able to programmatically (i.e. C++ end), ask the engine to load the modules, as part of the initialisation (or shortly thereafter).
Anyone knows how I can do this?

Hmm, I just use the simple approach: My C++ code just calls Lua's require function to pre-load the Lua scripts I want preloaded!
// funky = require ("funky")
//
lua_getfield (L, LUA_GLOBALSINDEX, "require"); // function
lua_pushstring (L, "funky"); // arg 0: module name
err = lua_pcall (L, 1, 1, 0);
// store funky module table in global var
lua_setfield (L, LUA_GLOBALSINDEX, "funky");
// ... later maybe handle a non-zero value of "err"
// (I actually use a helper function instead of lua_pcall
// that throws a C++ exception in the case of an error)
If you've got multiple modules to load, of course, put it in a loop... :)

The easiest way is to add and edit a copy of linit.c to your project.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Proper practice for setup upon load in R package development - c++

Related

c++ best way to realise global switches/flags to control program behaviour without tying the classes to a common point

How can I debug a C++ DLL function, called from VBA, using Visual Studio

Tcl Extension Calling a VB.NET DLL

In R, is there a way to know the path to the source file where a given function is defined?

embedded Lua loading modules programmatically (C++)

Categories

Resources