Passing command line arguments when profiling with llvm-prof - llvm

How can I pass command line arguments to my program when profiling with llvm-prof?
And where can I find more comprehensive documentation for llvm-prof? "llvm-prof -help" output is too brief. Its manual is even shorter.

I would recommend staying away from llvm-prof at this point. The reason is that it was actually removed from trunk LLVM a month ago (in revision 191835). Here is the commit message that should clarify the motivation:
Remove the very substantial, largely unmaintained legacy PGO
infrastructure.
This was essentially work toward PGO based on a design that had several
flaws, partially dating from a time when LLVM had a different
architecture, and with an effort to modernize it abandoned without being
completed. Since then, it has bitrotted for several years further. The
result is nearly unusable, and isn't helping any of the modern PGO
efforts. Instead, it is getting in the way, adding confusion about PGO
in LLVM and distracting everyone with maintenance on essentially dead
code. Removing it paves the way for modern efforts around PGO.
Among other effects, this removes the last of the runtime libraries from
LLVM. Those are being developed in the separate 'compiler-rt' project
now, with somewhat different licensing specifically more appropriate for
runtimes.

The way you write the question implies that you are trying to execute your program using llvm-prof. However, that is not how it works. To profile, first instrument your code with counters using:
opt -disable-opt -insert-edge-profiling -o program.profile.bc program.bc
Then execute the instrumented program using lli as follows:
lli -O0 -fake-argv0 'program.bc < YOUR_ARGS' -load llvm/Debug+Asserts/lib/libprofile_rt.so program.profile.bc
Note how the arguments are passed to the program using -fake-argv0 'program.bc < YOUR_ARGS' above. This step generates an llvmprof.out file, which can then be read with llvm-prof to produce the execution profiles as follows:
llvm-prof program.profile.bc

Related

How to deliberately slow down compilation?

There are a lot of questions asking how to speed up compilation of C++ code. I need to do the opposite.
I'm working with software that monitors compiler invocations in order to do static code analysis. But if the compiler process exits too quickly, the monitoring software can miss it. So I need to slow compilation down. I understand that's a terrible solution and hope it will be temporary.
I came up with two solutions:
Disable parallel build, enable preprocessor and compiler listing generation. It works but requires a lot of mouse clicking.
Use a compiler option to force inclusion of a special header file that somehow slows compilation.
Unfortunately I couldn't come up with something simple to write and hard to compile at the same time. Using a lot of #warning seems to work but obviously clutters the output significantly.
I'm using Keil with the armcc compiler, so I can use most of C++11, but the maximum template recursion depth is just 63.
Preferably this should not produce any overhead for binary size or running time.
UPD: I'll try to clarify this a bit. I know that's a horrible idea, I know that this problem should be solved differently. I will try to solve it differently but I also want to explore this possibility.
Maybe this solution will be slow enough =), something like what @NathanOliver proposed.
It's a compile-time sine table that I use. It requires extra space, but you can tune it a little (the table size and sine accuracy are template parameters of the "staticSinus" function; hopefully you'll find what suits you best).
https://godbolt.org/z/DYZDF5
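For reference, here is a minimal sketch of the same idea (not the code behind the godbolt link, and assuming a C++17-capable compiler rather than the asker's armcc); the table size and the number of Taylor terms are the knobs that trade compile time against space and accuracy:

// Illustrative compile-time sine table; the real code at the godbolt link
// is structured differently (template parameters instead of these knobs).
#include <array>
#include <cstddef>

// Naive Taylor-series sine, evaluated entirely at compile time.
constexpr double staticSinus(double x, int terms = 10) {
    double term = x, sum = x;
    for (int n = 1; n < terms; ++n) {
        term *= -x * x / ((2.0 * n) * (2.0 * n + 1.0));
        sum += term;
    }
    return sum;
}

template <std::size_t N>
constexpr std::array<double, N> makeSineTable() {
    std::array<double, N> table{};
    for (std::size_t i = 0; i < N; ++i)
        table[i] = staticSinus(2.0 * 3.141592653589793 * i / N);
    return table;
}

// The larger N (and the more Taylor terms), the longer the compiler works.
constexpr auto kSineTable = makeSineTable<4096>();

int main() { return kSineTable[0] == 0.0 ? 0 : 1; }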
You don't want to do anything of the sort. Here are some solutions, of varying degree of kludginess:
Ideal solution: invoke the code analysis from the Makefile.
Replace the compiler with an e.g. Python script that forwards the command-line to the compiler, then triggers the analysis tool (a rough sketch of this idea is shown below).
Monitor make instead of the compiler - it tends to live longer.
Have a tiny wrapper script maintain a reference count in shared memory, and when the reference count is initially incremented, the wrapper should go to sleep for "long enough" after the compiler has finished. Monitor that script.
In a nutshell: the monitoring tool shouldn't be monitoring anything. The code analysis should be invoked from the build tool, i.e. given in the Makefile. If generating the Makefile by hand is too cumbersome, use cmake with ninja, or xmake with no dependencies. You can also generate whatever "project" file the IDE needs to make working on the project easier. But make something else than Keil-specific stuff be the source of truth for the project: it'll make everything go easy from then on.
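For the forwarding-wrapper option mentioned above, a very rough sketch of the idea - the answer suggests Python, but a C++ equivalent is shown here, and the names armcc_real and analysis_tool are made-up placeholders:

// wrapper.cpp - installed in place of the real compiler: forward the whole
// command line to the real compiler, then trigger the analysis tool.
#include <cstdlib>
#include <string>

int main(int argc, char **argv) {
    std::string args;
    for (int i = 1; i < argc; ++i) {
        args += " \"";
        args += argv[i];
        args += "\"";
    }
    // Run the real compiler first so the build still works as before.
    int status = std::system(("armcc_real" + args).c_str());
    // Then hand the same command line to the analysis tool.
    std::system(("analysis_tool" + args).c_str());
    // std::system's return value is implementation-defined; treat 0 as success.
    return status == 0 ? 0 : 1;
}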

Writing Complete analysis using llvm-clang

I have following tasks that I need to do as part of my research idea:
1. Parse the C file(s) at hand to get llvm-IR.
2. Do analysis on the IR. Possibly add and remove some instructions or BB
3. Emit either x86 executable or C (need to decide later)
I think this is a quite common task for anyone writing analysis on C. I want to do all these tasks in C/C++ (as most of our research code is in C/C++). I googled a lot; although a lot of documentation is available on Tasks 2 and 3, less is available on Task 1, so any ideas on this would be really helpful.
I want to hook these tasks together as a pipeline; any suggestions on this are also welcome.
-Thanks
(1) can be done by using Clang to emit LLVM IR.
(2) can be done by writing your own LLVM pass, and later invoking it (with any other passes you're interested in) via LLVM's opt tool.
(3) (to x86) can be done by LLVM's llc tool.
All of these are also accessible as APIs, not only command-line tools, making it possible to incorporate them into your pipeline.
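For step (2), a minimal skeleton of a legacy-style pass (in the spirit of LLVM's "Writing an LLVM Pass" tutorial from that era) might look like this; the struct name BBCount and the pass name "bbcount" are placeholders:

// BBCount.cpp - example analysis pass that counts basic blocks per function.
// Build it as a shared library and run it through opt, roughly:
//   opt -load ./LLVMBBCount.so -bbcount -disable-output program.bc
#include "llvm/Pass.h"
#include "llvm/IR/Function.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

namespace {
struct BBCount : public FunctionPass {
    static char ID;
    BBCount() : FunctionPass(ID) {}

    bool runOnFunction(Function &F) override {
        errs() << F.getName() << " has " << F.size() << " basic blocks\n";
        return false;   // the IR is not modified
    }
};
}

char BBCount::ID = 0;
static RegisterPass<BBCount> X("bbcount", "Count basic blocks in each function");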

Why is the LLVM execution engine faster than compiled code?

I have a compiler which targets LLVM, and I provide two ways to run the code:
Run it automatically. This mode compiles the code to LLVM and uses the ExecutionEngine JIT to compile it into machine code on-the-fly and run it without ever generating an output file.
Compile it and run it separately. This mode outputs an LLVM .bc file, which I manually optimise (with opt), compile to native assembly (with llc), compile to machine code and link (with gcc), and run.
I was expecting approach #2 to be faster than approach #1, or at least the same speed, but running a few speed tests, I am surprised to find that #2 consistently runs about twice as slow. That is a huge speed difference.
Both cases are running the same LLVM source code. With approach #1, I haven't yet bothered to run any LLVM optimisation passes (which is why I was expecting it to be slower). With approach #2, I am running opt with -std-compile-opts and llc with -O3, to maximise optimisation, yet it isn't getting anywhere near #1. Here is an example run of the same program:
#1 without optimisation: 11.833s
#2 without optimisation: 22.262s
#2 with optimisation (-std-compile-opts and -O3): 18.823s
Is the ExecutionEngine doing something special that I don't know about? Is there any way for me to optimise the compiled code to achieve the same performance as the ExecutionEngine JIT?
It is normal for a VM with a JIT to run some applications faster than a compiled application. That's because a VM with a JIT is like a simulator that simulates a virtual computer, and also runs a compiler in real time. Because both tasks are built into the VM with a JIT, the machine simulator can feed information to the compiler so that the code can be recompiled to run more efficiently. The information it provides is not available to statically compiled code.
This effect has also been noted with Java VMs and with Python's PyPy VM, among others.
Another issue is code alignment and other optimizations. Nowadays CPUs are so complex that it's hard to predict which techniques will result in faster execution of the final binary.
As a real-life example, let's consider Google's Native Client - I mean the original NaCl compilation approach, not the one involving LLVM (because, as far as I know, there is currently a direction of supporting both "nativeclient" and (modified) "LLVM bitcode" code).
As you can see in presentations (check out youtube.com) or in papers, like Native Client: A Sandbox for Portable, Untrusted x86 Native Code, even though their alignment technique makes the code size bigger, in some cases such aligning of instructions (for example with no-ops) gives better cache hit rates.
Aligning instructions with no-ops and instruction reordering are known techniques in parallel computing, and they show their impact here as well.
I hope this answer gives an idea of how many circumstances can influence code execution speed, and that there are many possible reasons for differences between pieces of code, each of which needs investigation. Nevertheless, it's an interesting topic, so if you find some more details, don't hesitate to re-edit your answer and let us know in a postscript what you have found (maybe a link to a whitepaper/devblog with the new findings :) ). Benchmarks are always welcome - take a look: http://llvm.org/OpenProjects.html#benchmark.

Edit and Continue on GDB

I know that E&C is a controversial subject and some say that it encourages a wrong approach to debugging, but still - I think we can agree that there are numerous cases when it is clearly useful - experimenting with different values of some constants, redesigning GUI parameters on-the-fly to find a good look... You name it.
My question is: Are we ever going to have E&C on GDB? I understand that it is a platform-specific feature and needs some serious cooperation with the compiler, the debugger and the OS (MSVC has this one easy as the compiler and debugger always come in one package), but... It still should be doable. I've even heard something about Apple having it implemented in their version of GCC [citation needed]. And I'd say it is indeed feasible.
Knowing all the hype about MSVC's E&C (my experience says it's the first thing MSVC users mention when asked "why not switch to Eclipse and gcc/gdb"), I'm seriously surprised that after quite some years GCC/GDB still doesn't have such a feature. Are there any good reasons for that? Is someone working on it as we speak?
It is a surprisingly non-trivial amount of work, encompassing many design decisions and feature tradeoffs. Consider: you are debugging. The debugee is suspended. Its image in memory contains the object code of the source, and the binary layout of objects, the heap, the stacks. The debugger is inspecting its memory image. It has loaded debug information about the symbols, types, address mappings, pc (ip) to source correspondences. It displays the call stack, data values.
Now you want to allow a particular set of possible edits to the code and/or data, without stopping the debuggee and restarting. The simplest might be to change one line of code to another. Perhaps you recompile that file or just that function or just that line. Now you have to patch the debuggee image to execute that new line of code the next time you step over it or otherwise run through it. How does that work under the hood? What happens if the code is larger than the line of code it replaced? How does it interact with compiler optimizations? Perhaps you can only do this on a target specially compiled for EnC debugging. Perhaps you will constrain the possible sites where it is legal to EnC. Consider: what happens if you edit a line of code in a function suspended down in the call stack. When the code returns there, does it run the original version of the function or the version with your line changed? If the original version, where does that source come from?
Can you add or remove locals? What does that do to the call stack of suspended frames? Of the current function?
Can you change function signatures? Add fields to / remove fields from objects? What about existing instances? What about pending destructors or finalizers? Etc.
There are many, many functionality details to attend to in order to make any kind of usable EnC work. Then there are many cross-tools integration issues necessary to provide the infrastructure to power EnC. In particular, it helps to have some kind of repository of debug information that can make the before- and after-edit debug information and object code available to the debugger. For C++, the incrementally updatable debug information in PDBs helps. Incremental linking may help too.
Looking from the MS ecosystem over into the GCC ecosystem, it is easy to imagine that the complexity and integration issues across GDB/GCC/binutils, the myriad of targets, some needed EnC-specific target abstractions, and the "nice to have but inessential" nature of EnC are why it has not yet appeared in GDB/GCC.
Happy hacking!
(p.s. It is instructive and inspiring to look at what the Smalltalk-80 interactive programming environment could do. In St80 there was no concept of "restart" -- the image and its object memory were always live, if you edited any aspect of a class you still had to keep running. In such environments object versioning was not a hypothetical.)
I'm not familiar with MSVC's E&C, but GDB has some of the things you've mentioned:
http://sourceware.org/gdb/current/onlinedocs/gdb/Altering.html#Altering
17. Altering Execution
Once you think you have found an error in your program, you might want to find out for certain whether correcting the apparent error would lead to correct results in the rest of the run. You can find the answer by experiment, using the gdb features for altering execution of the program.
For example, you can store new values into variables or memory locations, give your program a signal, restart it at a different address, or even return prematurely from a function.
Assignment: Assignment to variables
Jumping: Continuing at a different address
Signaling: Giving your program a signal
Returning: Returning from a function
Calling: Calling your program's functions
Patching: Patching your program
Compiling and Injecting Code: Compiling and injecting code in GDB
This is a pretty good reference to the old Apple implementation of "fix and continue". It also references other working implementations.
http://sources.redhat.com/ml/gdb/2003-06/msg00500.html
Here is a snippet:
Fix and continue is a feature implemented by many other debuggers,
which we added to our gdb for this release. Sun Workshop, SGI ProDev
WorkShop, Microsoft's Visual Studio, HP's wdb, and Sun's Hotspot Java
VM all provide this feature in one way or another. I based our
implementation on the HP wdb Fix and Continue feature, which they
added a few years back. Although my final implementation follows the
general outlines of the approach they took, there is almost no shared
code between them. Some of this is because of the architectual
differences (both the processor and the ABI), but even more of it is
due to implementation design differences.
Note that this capability may have been removed in a later version of their toolchain.
UPDATE: Dec-21-2012
There is a GDB Roadmap PDF presentation that includes a slide describing "Fix and Continue" among other bullet points. The presentation is dated July-9-2012 so maybe there is hope to have this added at some point. The presentation was part of the GNU Tools Cauldron 2012.
Also, I get it that adding E&C to GDB or anywhere in Linux land is a tough chore with all the different components.
But I don't see E&C as controversial. I remember using it in VB5 and VB6, and it was probably there before that. Also, it's been in Office VBA since way back. And it's been in Visual Studio since VS2005. VS2003 was the only one that didn't have it, and I remember devs howling about it. They intended to add it back anyway, and they did with VS2005, and it's been there since. It works with C#, VB, and also C and C++. It's been in MS core tools for 20+ years, almost continuously (counting VB when it was standalone), and subtracting VS2003. But you could still say they had it in Office VBA during the VS2003 period ;)
And JetBrains recently added it to their C# tool Rider. They bragged about it (rightly so, imo) in their Rider blog.

Dead code detection in legacy C/C++ project [closed]

How would you go about dead code detection in C/C++ code? I have a pretty large code base to work with and at least 10-15% is dead code. Is there any Unix-based tool to identify these areas? Some pieces of code still use a lot of preprocessor macros; can an automated process handle that?
You could use a code coverage analysis tool for this and look for unused spots in your code.
A popular tool for the gcc toolchain is gcov, together with the graphical frontend lcov (http://ltp.sourceforge.net/coverage/lcov.php).
If you use gcc, you can compile with gcov support, which is enabled by the '--coverage' flag. Next, run your application or run your test suite with this gcov-enabled build.
Basically gcc will emit some extra files during compilation and the application will also emit some coverage data while running. You have to collect all of these (.gcno and .gcda files). I'm not going into full detail here, but you probably need to set two environment variables to collect the coverage data in a sane way: GCOV_PREFIX and GCOV_PREFIX_STRIP...
After the run, you can put all the coverage data together and run it through the lcov toolsuite. Merging of all the coverage files from different test runs is also possible, albeit a bit involved.
Anyhow, you end up with a nice set of webpages showing some coverage information, pointing out the pieces of code that have no coverage and hence, were not used.
Of course, you need to double-check whether those portions of code are really not used in any situation, and a lot depends on how well your tests exercise the codebase. But at least this will give an idea about possible dead-code candidates...
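As a toy illustration of that workflow (the file and function names here are made up):

// toy.cpp - minimal example of spotting an unexecuted function with gcov.
// A possible sequence of commands:
//   g++ --coverage -c toy.cpp      (emits toy.o and toy.gcno)
//   g++ --coverage -o toy toy.o
//   ./toy                          (emits toy.gcda)
//   gcov toy.cpp                   (annotates the source; lcov can turn the
//                                   same data into HTML reports)
// never_called() ends up with an execution count of 0, i.e. a dead-code candidate.
#include <iostream>

int used(int x)         { return 2 * x; }
int never_called(int x) { return 3 * x; }   // candidate dead code

int main() {
    std::cout << used(21) << "\n";
    return 0;
}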
Compile it under gcc with -Wunreachable-code.
I think that the more recent the version, the better results you'll get, but I may be wrong in my impression that it's something they've been actively working on. Note that this does flow analysis, but I don't believe it tells you about "code" which is already dead by the time it leaves the preprocessor, because that's never parsed by the compiler. It also won't detect, for example, exported functions which are never called, or special-case handling code which just so happens to be impossible because nothing ever calls the function with that parameter - you need code coverage for that (and run the functional tests, not the unit tests. Unit tests are supposed to have 100% code coverage, and hence execute code paths which are 'dead' as far as the application is concerned). Still, with these limitations in mind, it's an easy way to get started finding the most completely bollixed routines in the code base.
This CERT advisory lists some other tools for static dead code detection
For C code only and assuming that the source code of the whole project is available, launch an analysis with the Open Source tool Frama-C. Any statement of the program that displays red in the GUI is dead code.
If you have "dead code" problems, you may also be interested in removing "spare code", code that is executed but does not contribute to the end result. This requires you to provide an accurate modelization of I/O functions (you wouldn't want to remove a computation that appears to be "spare" but that is used as an argument to printf). Frama-C has an option for pointing out spare code.
Your approach depends on the availability of (automated) tests. If you have a test suite that you trust to cover a sufficient amount of functionality, you can use a coverage analysis, as previous answers already suggested.
If you are not so fortunate, you might want to look into source code analysis tools like SciTools' Understand that can help you analyse your code using a lot of built-in analysis reports. My experience with that tool dates from 2 years ago, so I can't give you much detail, but what I do remember is that they had impressive support with very fast turnaround times on bug fixes and answers to questions.
I found a page on static source code analysis that lists many other tools as well.
If that doesn't help you sufficiently either, and you're specifically interested in finding out the preprocessor-related dead code, I would recommend you post some more details about the code. For example, if it is mostly related to various combinations of #ifdef settings you could write scripts to determine the (combinations of) settings and find out which combinations are never actually built, etc.
Both Mozilla and Open Office have home-grown solutions.
g++ 4.01 -Wunreachable-code warns about code that is unreachable within a function, but does not warn about unused functions.
int foo() {
    return 21; // point a
}

int bar() {
    int a = 7;
    return a;
    a += 9; // point b
    return a;
}

int main(int, char **) {
    return bar();
}
g++ 4.01 will issue a warning about point b, but say nothing about foo() (point a) even though it is unreachable in this file. This behavior is correct although disappointing, because a compiler cannot know that function foo() is not declared extern in some other compilation unit and invoked from there; only a linker can be sure.
Dead code analysis like this requires a global analysis of your entire project. You can't get this information by analyzing translation units individually (well, you can detect dead entities if they are entirely within a single translation unit, but I don't think that's what you are really looking for).
We've used our DMS Software Reengineering Toolkit to implement exactly this for Java code, by parsing all the compilation-units involved at once, building symbol tables for everything and chasing down all the references. A top level definition with no references and no claim of being an external API item is dead. This tool also automatically strips out the dead code, and at the end you can choose what you want: the report of dead entities, or the code stripped of those entities.
DMS also parses C++ in a variety of dialects (EDIT Feb 2014: including MS and GCC versions of C++14 [EDIT Nov 2017: now C++17]) and builds all the necessary symbol tables. Tracking down the dead references would be straightforward from that point. DMS could also be used to strip them out. See http://www.semanticdesigns.com/Products/DMS/DMSToolkit.html
The Bullseye coverage tool would help. It is not free, though.