How to profile Rcpp code (on linux) - c++

I made an R package with Rcpp where whole simulations are run in C++ and results are analyzed in R. Now I need to profile my functions so I can optimize them, but R profilers can't distinguish what happens inside the C++ functions, and I don't know how to run C++ profilers when the functions can only be run from inside R.
So far, I have found some suggestions to use gperftools (questions and tutorials) but the guides are incomplete (maybe they assume a level of knowledge that I lack?), have missing links, and I keep running into walls. Hence this question. Here's where I'm at:
1. Install gperftools (I installed from extra/gperftools with pacman).
2. Include gperftools/profiler.h in the C++ header.
3. Add ProfilerStart("myprof.log") and ProfilerStop() in the C++ code around what I want to profile.
4. Compile with -lprofiler.
5. Run "$ CPUPROFILE="myprof.log" R -f myscript.R".
The current wall is that gcc tells me "Undefined Symbol: ProfilerStart", so I think there's something wrong with the linking?

I'm not really very impressed with gperftools. Also, it appears to be an instrumenting profiler; sampling-based profilers are easier to use and are likely to run faster. Intel's VTune is an excellent sampling-based profiler, available for free if you're an educational user. Even if you're not, your organisation may already have licenses.
Turning to your gperftools issue, yes, that's a linker issue. As you have decided not to share any of the relevant information (link command? compile command? Actual error messages?) we can't help you further.

It was a linking error after all, caused by my lack of experience, as this is the first time I have used Makevars.
In step #4, I added "-lprofiler" to PKG_CXXFLAGS, which is used when compiling, when I should have added it to PKG_LIBS, which is used when linking. I made the change and now the profiler works just fine. This is my Makevars now:
PKG_CXXFLAGS += -Wall -pedantic -g -ggdb #-fno-inline-small-functions
PKG_LIBS += -lprofiler
CXX_STD = CXX11
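For illustration, here is a minimal sketch of how the ProfilerStart/ProfilerStop calls from step 3 could be wrapped so the profiler always stops when the profiled scope ends. ScopedProfile, run_simulation and the WITH_GPERFTOOLS macro are hypothetical names invented for this sketch; only ProfilerStart and ProfilerStop come from gperftools/profiler.h. The macro is an assumption so that the code also builds, as a no-op, on a machine without gperftools.

```cpp
#include <string>

// Assumption for this sketch: define WITH_GPERFTOOLS (e.g. in PKG_CXXFLAGS)
// and link with -lprofiler (in PKG_LIBS) to enable profiling. Without the
// macro the guard compiles to a no-op.
#ifdef WITH_GPERFTOOLS
#include <gperftools/profiler.h>
#endif

// Hypothetical RAII helper (not part of gperftools or Rcpp): starts the CPU
// profiler on construction and stops it when the scope ends, even if the
// profiled code exits early or throws.
class ScopedProfile {
public:
    explicit ScopedProfile(const std::string& path) {
#ifdef WITH_GPERFTOOLS
        active_ = ProfilerStart(path.c_str()) != 0;
#else
        (void)path;  // profiling disabled in this build
#endif
    }
    ~ScopedProfile() {
#ifdef WITH_GPERFTOOLS
        if (active_) ProfilerStop();
#endif
    }
    ScopedProfile(const ScopedProfile&) = delete;
    ScopedProfile& operator=(const ScopedProfile&) = delete;

private:
    bool active_ = false;
};

// Stand-in for the simulation code; in the package this would be the body of
// an Rcpp-exported function.
double run_simulation(int steps) {
    ScopedProfile guard("myprof.log");
    double acc = 0.0;
    for (int i = 1; i <= steps; ++i) acc += 1.0 / i;
    return acc;
}
```

The guard only decides where the calls go. The linking fix itself is still the PKG_LIBS line in the Makevars above.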

Related

Are there any downsides to compiling with -g flag?

GDB documentation tells me that in order to compile for debugging, I need to ask my compiler to generate debugging symbols. This is done by specifying a '-g' flag.
Furthermore, GDB doc recommends I'd always compile with a '-g' flag. This sounds good, and I'd like to do that.
But first, I'd like to find out about downsides. Are there any penalties involved with compiling-for-debugging in production code?
I am mostly interested in:
GCC as the compiler of choice
Red hat Linux as target OS
C and C++ languages
(Although information about other environments is welcome as well)
Many thanks!
If you use -g (which on recent GCC or Clang can be used with optimization flags like -O2):
compilation time is slower (and linking will use a lot more memory)
the executable is a bigger file (see elf(5) and use readelf(1)...)
the executable carries a lot of information about your source code.
you can use GDB easily
some interesting libraries, like Ian Taylor's libbacktrace, require DWARF information (e.g. -g)
If you don't use -g it would be harder to use the GDB debugger (but possible).
So if you transmit the binary executable to a partner that should not understand how your source code was written, you need to avoid -g.
See also the strip(1) and strace(1) commands.
Notice that using the -g flag for debugging information is also valid for OCaml and Rust.
PS. Recent GCC (e.g. GCC 10 or GCC 11 in 2021) accepts many debugging flags. With -g3 your executable carries more debug information (e.g. descriptions of C++ macros and their expansions) than with -g or -g1. Of course, compilation time increases, and so does executable size. In principle, your GCC plugin (perhaps Bismon in 2021, or those inside the source code of the Linux kernel) could add even more debug information. In practice, you won't do that unless you can improve your debugger. However, a GCC plugin (or some #pragmas) can remove some debug information (e.g. remove debug information for a selected set of functions).
Generally, adding debug information increases the size of the binary files (or creates extra files for the debug information). That's nowadays usually not a problem, unless you're distributing it over slow networks. And of course this debug information may help others in analyzing your code, if they want to do that. Typically, the -g flag is used together with -O0 (the default), which disables compiler optimization and generates code that is as close as possible to the source, so that debugging is easier. While you can use debug information together with optimizations enabled, this is really tricky, because variables may not exist, or the sequence of instructions may be different than in the source. This is generally only done if an error needs to be analyzed that only happens after the optimizations are enabled. Of course, the downside of -O0 is poorer performance.
So as a conclusion: Typically one uses -g -O0 during development, and for distribution or production code just -O3.

I cannot compile this simple C++ program involving gd.h

This isn't my code, I am not a programmer but I did not expect simply compiling a provided source code would be so difficult.
Here it is, taken from Joel Yliluoma's page about "arbitrary-palette positional dithering algorithm", it was written in 2011.
This was my troubleshooting process, using MinGW:
The code didn't seem to make sense at all, so I realized it was written in an earlier version of C++, and added -std=c++98.
It couldn't find gd.h, I downloaded that from libgd's website, and directed to its directory using -I.
A bunch of gd-related commands got an "undefined reference to" treatment. I tried to direct the compiler to the gd.h/gd.c directory again using -L, followed by -lgd. And this is where I got stuck, as
The compiler insisted on not being able to find -lgd. I tried with different versions of libgd (especially older ones, before 2011) and sometimes it'd find what it's looking for, but then skip over them as they are incompatible.
I've also tried to compile it with another program called Dev-C++ but to no avail. Dev-C++ also gave back a "linker error". I can only assume that I messed up linking the header or library somehow, but I do not know what those terms mean frankly and just wanted a working program so I can get back to my imagery stuff. Maybe I downloaded the wrong gd.h, or I'm missing a required thing. Any help would be greatly appreciated.
Here's my current final MinGW input:
g++ -std=c++98 -Ipath\to\libgd code.cpp -Lpath\to\libgd -lgd -o executable.exe
I can assure you that path\to\libgd contains gd.h (and a bunch of other gd-related stuff) and one of these, depending on which version of libgd I found: libgd.lib, libgd.dll.a, libgd.def, libgd.rc, libgd.so.
I'm using Windows 7 64-bit.

Nonsense results from gprof

I've been trying to profile some C++ with gprof 2.25.2 (under Cygwin) and it is reporting that 10% of the time is being spent in a function which I know is not being called. (I put a print statement into the relevant function to verify this.) It also seems to think that this function is calling itself recursively (number of calls is 500+16636500), which it definitely isn't.
It's a large enough program that I don't have an easy way of producing a minimal working example I can post here, but if anyone has any ideas about what might be causing this, I would be grateful to know.
Edit: building with CMake + g++. CMAKE_BUILD_TYPE=RELWITHDEBINFO.
I'll assume you're using gcc/g++...
This sounds like a case of the debug symbols being out-of-date with respect to your source code or executable. Try cleaning your build space, recompiling (with -g or -ggdb3, of course). If you're compiling with optimizations and you can afford to turn them off (i.e. -O0 instead of -O1, -O2 or -O3), do so for this run. If that works, try -O1 or -O2 and see what happens.

Getting Started with Makefile for C++ (CMake or GNUMake?)

I have got my first project for this semester and I have been asked to submit it with a makefile. The literature available on the internet is a bit overwhelming and combined with my laziness, I came to stackoverflow for simple answers. I have found this answer by Brendan Long as a good place to start with.
The example he gives is:
all: a3driver.o
	g++ -o a3driver a3driver.o
a3driver.o: a3driver.cpp
	g++ -c a3driver.cpp
which I understand. This looks exactly like the makefiles I have seen on a Unix system and which I used to compile C++ files (only used, did not need to understand).
Then I searched further, and an answer to this question suggests using CMake, which is completely different from the code I have pasted above.
So my question at this stage is: which direction should I take? Should I learn CMake or GNU Make? I only intend to work on C++ files for now.
Only you can answer this question because it depends heavily on your needs. CMake is a "build control file generator", not a build control program. It doesn't actually build code: instead it will create a makefile, or a Visual Studio / Xcode / Eclipse project file, etc. You then use that build program (make, Visual Studio, Xcode, Eclipse) to actually build the code.
CMake is good if you need to support all those different types of builds across all those different architectures using their native build environments. If you're happy to use make on whichever architecture you need to build on (GNU make runs on all of those as well, and all those IDEs except possibly Visual Studio have good integration with native make) then using make directly is fine. GNU make has lots of advanced features which make it very flexible.
I don't really agree with esseks' assessment of the autotools, although I know it's a very common opinion. Also note that automake itself does not use unusual, verbose syntax: automake files are just makefiles. However, they have to be processed, and autoconf is how that's done... autoconf is more obscure, although not as bad as people make it out to be, depending on your needs. This isn't the place for that discussion, however.
I personally find the CMake format even more annoying and strange than autotools, and it doesn't meet my needs in many ways (for example, its support for cross-compilation is not as good as autotools'). However, I have to say its ability to generate native project files is really excellent, if you need it.
If you need a really, really dead simple makefile for compiling one or a few files only, then you are done with:
compile:
	g++ myprogram.cpp -o myprogram
(note that recipe lines must be indented with a tab, not spaces).
If you need flexibility, you are on the right path with CMake. I suggest you explore CMake, starting from their good tutorial or a simple example, as the basics are simpler to understand from code than to learn from the manual.
My personal opinion is to avoid GNU Automake (colloquially known as Autohell) because of the unusual, verbose syntax that sometimes scares beginners and tricks more experienced users.
EDIT: CMake is not used to compile; rather, it can generate makefiles for you, starting from a synthetic description of the project (where are the files to be compiled? What libraries are required? etc.). And it does this by checking for libraries, identifying the compiler, and carrying out other sanity checks you would otherwise need to code yourself.

C and C++ programming on Ubuntu 11.10 [closed]

Closed 9 years ago.
I've recently installed Ubuntu 11.10 and along with it the CodeBlocks IDE and I am aware that I have gcc and the std libraries by default.
My questions are:
Do you have any tips for a new C++ programmer on Ubuntu?
Any libraries I should get from the start?
A really good IDE I'm missing? (YMMV but I prefer to work in IDEs)
Any programming boons or traps I should be aware of from the start?
You don't need an IDE to code in C or C++ on Ubuntu. You can use a good editor (like emacs, which you can configure to suit your needs.).
A few tips for a newbie:
Always compile with -Wall -Wextra and perhaps even with -Werror -pedantic-errors
The order of arguments to the compiler (gcc or g++) is really important; I recommend:
general warnings and optimization flags (e.g. -Wall, -g to get debug info, -O, -flto etc., or -c to avoid linking, ...)
preprocessor options like -I include-dir and -D defined-symbol (or -H to understand which headers get included) etc.
source file[s] to compile like hello.c or world.cc
if you want to link existing object files such as else.o, add them after the source files
linker options (if relevant), notably -L library-dir (and probably -rdynamic if your program uses plugins with dlopen(3) ...)
libraries (like -lfoo -lbar), from higher-level libraries like libfoo.so to lower-level libraries
output file (i.e. produced executable), e.g. -o yourexec.
Always correct your source code until you get no warnings at all. Trust the compiler's warnings and error messages.
Learn how to use make and to write simple Makefile-s; see this example.
there are other builders, e.g. http://omake.metaprl.org/ etc
Compile your code with the -g flag to have the compiler produce debugging information; only when you have debugged your program, ask the compiler to optimize (e.g. with -O1 or -O2), especially before benchmarking.
Learn how to use gdb
Use a version control system like svn or git (even for a homework assignment). In 2015 I recommend git over svn
Backup your work.
Learn to use valgrind to hunt memory leaks.
NB
The advice above is not specific to Ubuntu 11.10; it could apply to other Linux distributions and other Ubuntu versions.
Qt Creator is a good IDE that also works well with simple Makefile-based projects. Also, as a C++ programmer you should check out Dia and Dia2Code for automatic generation of stubs from UML diagrams.
Since you ask more than one question I will answer each separately.
Do you have any tips for a new C++ programmer on Ubuntu?
Learn some build system such as CMake or SCons. Although understanding how make and Makefiles work is useful, there is a tendency to move away from make to more high-level tools which also provide configure-like functionality. Make is often still used for command-line builds; for example, with CMake you can generate Makefiles and build your projects using make.
Use a version control system such as git or Mercurial. I also recommend keeping the projects you care about on some external service like GitHub, at least for the purposes of backup.
Pay attention to compiler warnings but keep in mind that warnings only catch a fraction of possible errors. A more complete picture can be obtained using static analysis tools and dynamic analysis tools like Valgrind.
Any libraries I should get from the start?
You've already got the main one which is called the C++ Standard Library. Make sure that you know what it provides.
Boost will cover most of the remaining needs except GUI.
Gtkmm and Qt are two major C++ GUI frameworks.
A really good IDE I'm missing? (YMMV but I prefer to work in IDEs)
Eclipse - for a long time I've been thinking of it as a Java only IDE, but in fact it is an excellent IDE for almost anything (I even wrote my PhD thesis in it using TeXlipse plugin) and C/C++ support is improving all the time. Also CMake can generate Eclipse CDT project files.
Qt Creator - another excellent C++ IDE. It is very fast and has native CMake support
Any programming boons or traps I should be aware of from the start?
From my experience the most common sources of errors in C++ are pointers and resource management in case of exceptions. Make sure you understand and use the RAII idiom and smart pointers.
For a more complete list of traps and recommendations see the answers to this question.
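To make the RAII and smart-pointer advice above concrete, here is a minimal sketch of a resource that stays leak-free when an exception is thrown. Buffer, process and live_after are hypothetical names used only for illustration. The live counter exists purely so that a leak would be observable.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical resource type used only for illustration: it counts live
// instances so that a leak would show up as a non-zero count.
struct Buffer {
    static int live;
    std::vector<double> data;
    explicit Buffer(std::size_t n) : data(n, 0.0) { ++live; }
    ~Buffer() { --live; }
};
int Buffer::live = 0;

// May throw after acquiring the resource. Because the Buffer is owned by a
// std::unique_ptr, stack unwinding destroys it and nothing leaks.
void process(std::size_t n) {
    std::unique_ptr<Buffer> buf(new Buffer(n));  // C++11; std::make_unique needs C++14
    if (n == 0) throw std::invalid_argument("empty buffer");
    buf->data[0] = 1.0;
}

// Returns the number of live Buffers after process() has returned or thrown.
int live_after(std::size_t n) {
    try {
        process(n);
    } catch (const std::invalid_argument&) {
        // Error handled; the unique_ptr has already released the Buffer.
    }
    return Buffer::live;
}
```

With a raw `new Buffer(n)` and a manual `delete` at the end of process, the throwing path would skip the delete and the count would stay above zero.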
Some tips besides those which are already mentioned:
Valgrind is your friend in finding memory leaks. You may also use valgrind --tool=callgrind and KCacheGrind to see where your program spends its execution time.
If you are going to distribute your program, you should learn autotools or cmake. The first is a classical tool, a bit bloated, the second is more modern.
Geany is a nice IDE if you are looking for something lightweight. Otherwise, take a look at Code::Blocks, Eclipse/CDT and NetBeans.
Since I am not sure what you meant by "std libraries", I should mention that besides standard C library, there are a lot of POSIX functions and interfaces, which are common to most *nix-systems, including Mac OS X.
Eclipse/CDT runs really well on Ubuntu.
Boost provides a whole bunch of libraries that are commonly used and can come in handy. Anyway, I'm not really sure this question fits in too well on a Q&A board.
EDIT: As suggested by Basile, Makefiles and learning to use gdb are great ideas. There are plenty of neat flags to use with gcc also, for helping to debug your code, optimize it, produce assembly instructions, etc.
In your first steps in programming you should not use an IDE, because you will better understand what happens behind the scenes :) GCC or G++ and the standard library will be sufficient. You should also read about Makefiles, version control (SVN, CVS, Git), and Autotools or CMake to manage your projects. If you want to make GUI applications you should learn GTK+ or Qt. If you want a real IDE for your needs, try Eclipse with the C/C++ plugins. Good luck :)
If you are familiar with the command line you can use an editor like vim and gcc/g++ to compile your code; learning make, svn, and git is also recommended.
In case you are not familiar with the command line or you prefer using the UI: NetBeans is also a good IDE you can use to develop in C/C++ and Java.
To install NetBeans: open Firefox and point it to apt://netbeans
I hope this will help you.
I think NetBeans is good. It has the same UI on Microsoft Windows and Linux, built-in version control, and Git support installed by default.
No extra library is added (as opposed to Qt).
Library: I recommend you use Boost. You can find many libraries in it.
IDE: Eclipse and Qt Creator are good IDEs, but I think it is also very important to use a text editor + makefile. Vim, Emacs, or Sublime Text are good choices.
Always remember to backup your code.