Guidelines for including TMB C++ code in an R package

I've recently discovered the wonders of TMB and I'm working on a package which would ideally include TMB C++ templates for some rather computationally expensive models.
I'm assuming that there's a possibility of:
Automatically compiling the TMB source code on package install
but I can't find any clear guidelines in the TMB documentation regarding this. As of now, my alternative is to write functions that compile the TMB code upon the first call of a function which uses a not-yet-compiled model... but I have a feeling there are nicer ways to do this.
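For concreteness, the workaround I have in mind looks roughly like this (just a sketch; "mymodel" is a placeholder template name):
# compile-on-first-call fallback: build and load the template the first time it's needed
run_model <- function(data, parameters) {
  if (!"mymodel" %in% names(getLoadedDLLs())) {
    TMB::compile("mymodel.cpp")        # compile the C++ template
    dyn.load(TMB::dynlib("mymodel"))   # load the resulting shared object
  }
  TMB::MakeADFun(data = data, parameters = parameters, DLL = "mymodel")
}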
Has anyone successfully included TMB functions within another package and could point me in the direction of relevant documentation or examples?

With a bit more searching I finally found my answer in this thread. I guess I missed it because the resolutions it details were moved to the wiki page titled "Development", where the content is specifically targeted at users wishing to contribute to the development of TMB, whereas I just want to distribute code which incorporates TMB.
To summarize, the thread suggests some changes, which I adopted as follows (myPkg should be replaced with the name of your package):
src/
Place your .cpp template in myPkg/src/. It will then be compiled automatically by R when you build your package.
DESCRIPTION
Add these lines to your DESCRIPTION file so R has all the tools necessary to compile the model template:
Depends: TMB, RcppEigen
LinkingTo: TMB, RcppEigen
R/roxygentags.r
Now we need to add our TMB template to the namespace file. We can do this easily through roxygen by making a dummy file like so:
#' Roxygen commands
#'
#' @useDynLib myPkg
#'
dummy <- function(){
  return(NULL)
}
The dummy function is just an excuse to have the tag @useDynLib myPkg somewhere in my source code where I won't mess with it. This tag will populate your NAMESPACE with useDynLib(myPkg)... and, as I understand it, this loads the shared library for you when the package is loaded.
Calling the function in your package:
Finally, when calling MakeADFun, set DLL="myPkg". With this setup, you can compile only a single TMB model into your package: the object compiled from your ./src/ folder is automatically renamed according to your package name, so you cannot create uniquely named models.
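For example, a minimal call might look like this (a sketch; data and params stand for whatever your model expects):
# the DLL name must match the package name under this single-model setup
obj <- TMB::MakeADFun(data = data, parameters = params, DLL = "myPkg")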
EDIT: Solution for distributing multiple DLLs
After some more searching (same thread as referenced above)... I realized that the solution described in the official wiki (and detailed above) is only relevant for distributing a single DLL (i.e. a single TMB model).
If you want to distribute multiple TMB models in a package, you'll have to use your own Makefile. I've given a more detailed description on my blog, so I'll only briefly describe the steps here with regard to how they differ from the steps above.
src/Makefile
You'll have to define your own Makefile (or Makefile.win for Windows users) and drop it in your src/ directory. Here's an example that works for me:
all: template1.so template2.so
# Comment here preserves the prior tab

template1.so: template1.cpp
	Rscript --vanilla -e "TMB::compile('template1.cpp','-O0 -g')"

template2.so: template2.cpp
	Rscript --vanilla -e "TMB::compile('template2.cpp','-O0 -g')"

clean:
	rm -rf *o
For Windows, replace .so with .dll and use the relevant compiler flags; see ?TMB::compile for info regarding compiler flags for debugging.
R/roxygentags.r
This is slightly different from the version above:
#' Roxygen commands
#'
#' This is a dummy function whose purpose is to hold the useDynLib roxygen tag.
#' This tag will populate the NAMESPACE with the compiled C++ functions upon package install.
#'
#' @useDynLib template1
#' @useDynLib template2
#'
dummy <- function(){
  return(NULL)
}
Using your models in the package
Finally, the above changes will compile multiple uniquely named TMB templates and load them into the namespace. To call one of these models from your package:
obj <- MakeADFun(data = data,
                 parameters = params,
                 DLL = "template1",
                 inner.control = list(maxit = 10000),
                 silent = FALSE)
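From there the standard TMB workflow applies; for example (a sketch of the usual optimization step):
opt <- nlminb(obj$par, obj$fn, obj$gr)  # minimize the objective returned by MakeADFun
rep <- TMB::sdreport(obj)               # standard errors for estimated and reported quantities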
Tips...
I had issues when I tried compiling this on a Windows machine... it turned out to be related to not properly cleaning the src folder: I had old Linux-compiled files stuck in there. If you have compilation issues, it's worth manually cleaning out the residual files in your src/ directory from previous builds... or perhaps someone can give some good advice on writing a better Makefile!
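For example, something as crude as this, run from the package root, clears the stale objects (a sketch; adjust the patterns to taste):
unlink(Sys.glob(c("src/*.o", "src/*.so", "src/*.dll")))  # remove leftover compiled files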

If you want access to the CppAD library with the additional code from TMB (which is quite substantial!), you can use the WITH_LIBTMB macro as I do in this header here. This allows you to have multiple .cpp files which you can compile separately. Importantly, you only need to compile the code from the TMB header once, using a file like this one, which #includes the TMB.hpp header without defining WITH_LIBTMB.
This reduces the compilation time substantially, since each .cpp can be compiled on its own without all the code declared in TMB.hpp. Moreover, you can also use the code with Rcpp if you undefine and define a few macros, as I do in the link.
You can also have one file which can be used by TMB::MakeADFun. It requires a bit of manual work, but it can be done while also using Rcpp: run Rcpp::compileAttributes, rename the generated RcppExports.cpp to init.cpp, and then include these additional lines in the CallEntries array and the R_init_survTMB function:
CallEntries array.
R_init_survTMB function.
Note on using RStudio
RStudio calls Rcpp::compileAttributes (or something similar) each time you build, so you cannot use this approach directly. One way around it is to create a custom build script similar to the one here. It essentially calls R CMD INSTALL after removing the RcppExports.cpp file created by Rcpp::compileAttributes. I also like to run the tests by calling devtools::test(), but you can remove this if you like.
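A minimal sketch of such a build script, assuming it is run from the package root:
# build.R -- regenerate the wrappers, drop the file that would clash, then install
Rcpp::compileAttributes()            # writes src/RcppExports.cpp
file.remove("src/RcppExports.cpp")   # remove it so it cannot clash with init.cpp
system("R CMD INSTALL .")            # install the package
devtools::test()                     # optional: run the test suite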

Related

Using ocamldoc with packs

I have an ocamlbuild project which includes some files in a subdirectory with an .mlpack file listing them.
e.g. I have a file support/logging.ml which defines the module Support.Logging. The _tags file says "support": for-pack(Support).
This all builds and runs fine. But how can I generate docs for this using ocamldoc?
The most recent post I found was ocamldoc generation and packed files from 2011, which suggests using ocp-pack to generate one large .ml file and pass that to ocamldoc. However, that doesn't take into account the build order, so the generated module doesn't work due to forward references.
What's the best way to handle this?
The problem is described in the following bug report. Handling -pack inside ocamldoc requires an implementation effort that the maintainer is not motivated to perform, and so far nobody has stepped up to contribute a patch for this feature.
In the meantime, you can simply copy your foo.mlpack file to a foo.odocl and generate the documentation for the separate submodules. That's an imperfect workaround, as the doc will talk about X rather than Foo.X, but it's a least-effort solution.
Here's the solution I'm now using in my Makefile. It does work, and cross-references into the Support module work:
doc:
	ocp-pack -o support.ml.tmp support/logging.ml support/common.ml support/utils.ml support/basedir.ml support/qdom.ml support/system.ml
	echo '(** General support code; not 0install-specific *)' > support.ml
	cat support.ml.tmp >> support.ml
	rm support.ml.tmp
	$(OCAMLBUILD) 0install.docdir/index.html
	rm support.ml
It's hacky because:
You have to list the support/*.ml files in build order, by hand
The Makefile adds the doc comments for Support (otherwise, it takes the description of the first sub-module, which you don't want)

Moving from sourceCpp to a package w/Rcpp

I currently have a .cpp file that I can compile using sourceCpp(). The corresponding R function is created and the code works as expected.
Here it is:
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector exampleOne(NumericVector vectorOne, NumericVector vectorTwo){
  NumericVector outputVector = vectorOne + vectorTwo;
  return outputVector;
}
I am now converting my project over to a package using Rcpp. So I created the skeleton with RStudio and started looking at how to convert things over.
In Hadley's excellent primer on Cpp, he says in section "Using Rcpp in a Package":
If your package uses the Rcpp::export attribute then one additional step in the package build process is required. The compileAttributes function scans the source files within a package for Rcpp::export attributes and generates the code required to export the functions to R.
You should re-run compileAttributes whenever functions are added, removed, or have their signatures changed. Note that if you build your package using RStudio or devtools then this step occurs automatically.
So it looks like the code that compiled with sourceCpp() should work pretty much as is in a package.
I created the corresponding R file.
exampleOne <- function(vectorOne, vectorTwo){
  outToR <- .Call("exampleOne", vectorOne, vectorTwo, PACKAGE = "testPackage")
  outToR
}
Then I (re)built the package and I get this error:
Error in .Call("exampleOne", vectorOne, vectorTwo, PACKAGE = "testPackage") :
  C symbol name "exampleOne" not in DLL for package "testPackage"
Does anyone have an idea as to what else I need to do when taking code that compiles with sourceCpp() and then using it in a package?
I should note that I have read: "Writing a package that uses Rcpp" http://cran.rstudio.com/web/packages/Rcpp/vignettes/Rcpp-package.pdf and understand the basic structure presented there. However, after looking at the RcppExamples source code, it appears that the structure in the vignettes is not exactly the same as that used in the example package. For example there are no .h files used. Also neither the vignette nor the source code use the [[Rcpp::export]] attribute. This all makes it difficult to track down exactly where my error is.
Here is my "walk through" of how to go from using sourceCpp() to a package that uses Rcpp. If there is an error please feel free to edit this or let me know and I will edit it.
[NOTE: I HIGHLY recommend using RStudio for this process.]
So you have the sourceCpp() thing down pat and now you need to build a package. This is not hard, but can be a bit tricky, because the information out there about building packages with Rcpp ranges from the exhaustively thorough documentation you want with any R package (but that is above your head as a newbie) to the newbie-sensitive introductions (that may leave out a detail you happen to need).
Here I use oneCpp.cpp and twoCpp.cpp as the names of two .cpp files you will use in your package.
Here is what I suggest:
A. First I assume you have a version of theCppFile.cpp that compiles with sourceCpp() and works as you expect it to. This is not a must, but if you are new to Rcpp OR packages, it is nice to make sure your code works in this simple situation before you move to the more complicated case below.
B. Now build your package using Rcpp.package.skeleton() or use the Project>Create Project>Package w/Rcpp wizard in RStudio (HIGHLY recommended). You can find details about using Rcpp.package.skeleton() in hadley/devtools or Rcpp Attributes Vignette. The full documentation for writing packages with Rcpp is in Writing a package that uses Rcpp, however this one assumes you know your way around C++ fairly well, and does not use the new "Attributes" way of doing Rcpp. It will be invaluable though if you move toward making more complex packages.
You should now have a directory structure for your package that looks something like this:
yourPackageName
- DESCRIPTION
- NAMESPACE
- R/
  - RcppExports.R
- Read-and-delete-me
- man/
  - yourPackageName-package.Rd
- src/
  - Makevars
  - Makevars.win
  - oneCpp.cpp
  - twoCpp.cpp
  - RcppExports.cpp
Once everything is set up, do a "Build & Reload" if using RStudio, or run compileAttributes() if you are not in RStudio.
C. You should now see in your R/ directory a file called RcppExports.R. Open it and check it out. In RcppExports.R you should see the R wrapper functions for all the .cpp files you have in your src/ directory. Pretty sweet, eh?
D. Try out the R function that corresponds to the function you wrote in theCppFile.cpp. Does it work? If so, move on.
E. You can now just add new .cpp files like otherCpp.cpp to the src/ directory as you create them. Then you just have to rebuild the package, and the R wrappers will be generated and added to RcppExports.R for you. In RStudio this is just "Build & Reload" in the Build menu. If you are not using RStudio, you should run compileAttributes().
In short, the trick is to call compileAttributes() from within the root of the package. So, for instance, for package foo:
$ cd /path/to/foo
$ ls
DESCRIPTION man NAMESPACE R src
$ R
R> compileAttributes()
This command will generate the RcppExports.cpp and RcppExports.R that were missing.
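For reference, the generated R/RcppExports.R wrapper looks roughly like this (the exact registered symbol name varies with the Rcpp version). This is also why the hand-written .Call("exampleOne", ...) above fails: the attributes machinery registers a prefixed symbol, not the bare function name.
# generated by Rcpp::compileAttributes() -- do not edit by hand (approximate form)
exampleOne <- function(vectorOne, vectorTwo) {
  .Call('testPackage_exampleOne', PACKAGE = 'testPackage', vectorOne, vectorTwo)
}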
You are missing the forest for the trees.
sourceCpp() is a recent function; it is part of what we call "Rcpp attributes", which has its own vignette (with the same title; in the package, on my website, and on CRAN) which you may want to read. Among other things, it details how to turn something you compiled and ran using sourceCpp() into a package. Which is what you want.
Randomly jumping between documentation won't help you; at the end of the day, the genuine source documentation by the package authors may be preferable. Or to put a different spin on it: you are using a new feature but old documentation that doesn't reflect it. Try to write a basic package with Rcpp, i.e. come to it from the other end as well.
Lastly, there is a mailing list...

Expand macro inside doxygen comment for printing out software version

I have some C++ code base, documented with doxygen and built with GNU make.
Version information is centralized in the makefile, where I have something like:
VERSION=1.2.3.4
In my makefile, the CFLAGS add the following define:
CFLAGS += -DAPP_VERSION=$(VERSION)
This enables me to get the version in code, like this:
#include <iostream>
using namespace std;

// two-step stringification so the macro's value, not its name, is quoted
#define STR_EXPAND(tok) #tok
#define STR(tok) STR_EXPAND(tok)

int main()
{
    cout << "software version is " << STR(APP_VERSION) << endl;
}
Now, what I would like is to have this in the doxygen-produced html files:
Current version of software is 1.2.3.4
I managed to export the makefile variable into the doxygen configuration file with:
(edit: doxygen is called from makefile, through a 'make-doc' target)
PREDEFINED = APP_VERSION=$(VERSION)
But then, if I try something like this in the doxygen \mainpage command, it fails, because (of course) macro names don't get expanded in comments...
/**
\mainpage this is the doc
Current version is $(APP_VERSION) -- or -- ... is APP_VERSION
*/
Questions
Do you know of a way to "expand" that macro in the doxygen comments? This could be done by some sed processing on the file holding the comment in the makefile, but maybe it can be solved directly with doxygen?
How do other projects handle versioning (besides the automatic versioning tools that VCSs provide, I mean), in such a way that the version id is uniquely defined in one place and can be fetched both by the software build system and the documentation build system?
Related: How to display a defined value
Macros in comments are not generally expanded (see, for example, this answer). This is not unique to doxygen, and I can't think of a way to do this using the PREDEFINED configuration option.
As you state in the question, you can use sed, see the third bullet point in this answer. For example, using the following
INPUT_FILTER = "sed -e 's/VERSION/1.0/'"
will replace all instances of VERSION with 1.0 in all your source files (with FILTER_PATTERNS you can specify which files to filter, rather than processing all source files). You might not want VERSION to be expanded everywhere, so perhaps it is best to use a more distinctive token like $(VERSION) and sed that. Also, you will need a way of getting your version number from your makefile into your doxygen configuration file. This can be done with another sed.
To address your last bullet point, doxygen has the FILE_VERSION_FILTER configuration option for determining the version number of each file. Using this will print some version information (whatever is printed to standard out from the command specified in FILE_VERSION_FILTER) at the top of each file page. In the documentation there are examples of getting the version number using a number of different version control systems. Also, here is a page describing how to use git and doxygen to extract version information.
The only drawback with this configuration option is that I don't know how to specify where the file version information should appear in the final documentation. I presume you can use a layout file to change the layout of pages, but I have never done this and don't know how easy it would be to use one to include version information on the mainpage.
You need to use the "export" functionality of make, i.e. a very simple makefile with:
project_name=FooBar
export project_name

all:
	doxygen Doxyfile
This will allow you to use the following comments in C++:
/*! \mainpage Project $(project_name) Lorem ipsum dolor */
I can see this becoming a PITA with a large set of exports but it's a fairly simple way to do it. Alternatively you could run doxygen from a separate BASH script with all the exports in it to avoid polluting your Makefile too much.
The commands manual suggests that $(VARIABLE) expands environment variables. So maybe you can put your version in an environment variable?

Using closure library with jsTestDriver

I'm learning about the Google Closure Tools by writing a simple JavaScript game. I'm having trouble figuring out how to set up jsTestDriver so that it works well with the Closure Library.
Specifically: I'd like to use the goog.require mechanism to include any additional JavaScript files rather than have to manually add them all to the config file.
Following meyertee's suggestion, I made a simple script to automatically write the dependencies to a config file:
#!/bin/bash
cp tests/jsTestDriver.conf.proto tests/jsTestDriver.conf
libs/closure-library/closure/bin/build/closurebuilder.py --root="./libs/closure-library" --root="./js" --namespace="lds" | sed "s#^# - \.\./#" >> tests/jsTestDriver.conf
The tests/jsTestDriver.conf.proto file is a simple template:
test:
 - "*.js"
load:
 - ../libs/knockout-2.1.0.js
# Crucial: the load key needs to be last, and this comment must be followed by a newline.
It is a very fragile script, but hopefully someone (other than me) will find it useful.
You can do it semi-automatically by letting the Closure Compiler generate a manifest file, which will output all files in the correct order of dependency. You can then transform that file to relative paths and paste them into the JsTestDriver config file. That's how I do it.
You could even write a script that does this transformation automatically.
This is the relevant compiler argument:
--output_manifest manifest.MF
There are some details on the Closure Compiler's Google Code Wiki
Edit:
There are also some Python scripts to help you calculate dependencies. You can use calcdeps.py or closurebuilder.py to generate a manifest file, which even includes files that haven't been 'required' by your code.
Since JsTestDriver does not follow the Closure Library convention of declaring dependencies with goog.provide() and goog.require(), your best option may be meyertee's solution.
However, the Closure Library includes its own testing framework. See:
Test Driven Development with the Closure Framework
Asserts API

Using Non-Local Data/Media Files with a C++ Application (gtkmm)

I'm beginning development on an acoustic spectrum analysis tool (inspired by spek) written in C++ with gtkmm (C++ bindings for the GTK+ GUI toolkit). I would imagine that I should know how to do this by now, however...
My directory structure is a-la-GNOME, e.g src/, data/, po/, man/. The specific situation that presented the need for my inquiry is the use of a GTK UI Manager that will be located in data/ui. For this specific situation, I want to be able to load the user-interface from this file in an install-independent manner (e.g. loading of the file does not depend on a make install; the executable may be run [and load the UI file] either from src/ after running make [thus compiling the sources into the selfsame exectuable] or from its install prefix). How would I refer to the UI file in my source code (keeping in mind that the loading of the file is not performed by creating a file object (fopen(...)) but rather by passing a file location as a string argument to (UIManager).add_ui_from_file(...))?
In addition to this particular situation of a UI file, how would I do similar references to files (i.e. databases, INI files, XML schemas) by using the autotools build process? Is there a piece of relevant Automake code to quickly set up a project to use this type of directory structure?
Simply try both files, with the un-installed one taking precedence. In gtkmm, add_ui_from_file() reports failure by throwing an exception, so (assuming ui_manager is your Glib::RefPtr<Gtk::UIManager>):
try { ui_manager->add_ui_from_file("../data/ui/mygui"); }                                   // un-installed copy first
catch (const Glib::Error&) { ui_manager->add_ui_from_file("/installed/location/mygui"); }   // fall back to the installed copy
In Glom, I created a helper function that tries both locations, with both locations defined in the Makefile.am (this is simpler if you have only one Makefile.am, by using non-recursive automake, which is simpler anyway):
http://git.gnome.org/browse/glom/tree/glom/glade_utils.h#n38