Static C++ Api coverage tool - c++

Given a set of public headers, and various test code that makes use of these headers, I need to generate a list of used/unused API calls.
I am working with a platform that can not easily have traditional code coverage at runtime, but my requirements are a bit simpler hopefully.
I only need this to occur statically, and it seems as if this should be an easily accomplished thing (Most IDE's show all available function calls). I haven't found an appropriate tool for this though.
Can anyone recommend one? Or point me to the specific term for what I am looking for?
Thank you

Related

Is there a way to get a call graph for certain c++ function in Visual Studio?

I wonder whether there is a tool for VS that can show me a call graph (that is, a diagram listing all possible execution paths) for a given C++ function. It would help in navigating a big code base, in cases where a function is called in only a few places.
For oft-called functions like printf it could simply say:
too many options...
Again I guess it is not really easy to make such tool so I wonder if it exists, but you know it seems possible to do it so you never know... :)
EDIT: I know about find all references, but that gives just call sites of the function, not the call site of the function that called the function that called the function...
EDIT: VS is 2010, but if necessary VS2012 is an option.
You mentioned that you know about finding all the references. Have you looked into viewing the Call Hierarchy? It's probably not your "dream method" but it does allow you to look at a function in terms of "calls to" and "calls from" the given function. The window also allows you to add multiple functions to view in a tree format. So basically you would tree up or down through the possible outcomes.
Right click on the desired method ( could be anywhere in the hierarchy ) =>
Select "View Call Hierarchy"
Note that if you can add more than one reference point to the window. Delete when needed
You could also use Ctrl+K or Ctrl+T
Another fine example, IMHO, of a disappointment in the differences between C++ and C# with VS. I think Code Maps would be just what you're looking for. Assuming of course you were working with Ultimate - but nope, not with C++.
There's no such feature in C++/MSVC, as far as I know.
However, there's AQTime profiler for windows that has "static analysis" option that (IF I remember correctly) scans compiled executable, generates call graph and shows you unreacheable functions.
If I remember correctly, AQtime integrates into visual studio (professional edition, afaik).
Unfortunately, this is a commercial profiler that costs around $500, and this feature is not available in trial version. Last time I used static analysis was 3..4 years ago and I don't exactly remember details at the moment (and I don't have access to AQTime anymore). Anyway, it is a specialized tool, so I wouldn't recommend buying it unless you're optimizing code for speed 24/7.
Perhaps, by googling "static analysis", "code coverage" or researching other profilers you'll find somewhat similar tool that does the job for free.
Aside from that, doxygen can generate callgraphs for C++ code. In case of doxygen, you'll have to hunt for functions that are never called yourself.
Also, Visual Studio 2008 had a built-in caller graph feature (which, I think, uses intellisense). Basically, you right click any function and select "show callers" (or something like that), that'll open list of all functions (visual studio THINKS are calling your function) in a window. Because this feature was present in VS2008, it should be included in VS2010. However, it can't detect every caller for obvious reasons (virtual methods, callbacks, etc).
Maybe doxygen is the tool you are looking for. It provides the possibility to generate call graphs (showing all functions called by a specific function) and/or caller graphs(showing the functions that the function is directly or indirectly called by).
see: http://www.doxygen.nl/manual/diagrams.html
Take a look at Understand tool (http://www.scitools.com). It's great for drawing call graphs and control flow charts.Unfortunately, it's commercial.
You can resolve results after doing Symbol search. Just right click in your source and then select find all references that performs symbol search. Its explained in further details at http://blogs.msdn.com/b/vcblog/archive/2009/11/17/improvements-to-find-all-references-in-visual-studio-2010.aspx
You can try CppDepend which give you the call graph inside VS and provides many features in its dependency graph.
Source Navigator is a tool that I have used and have been quite happy with on C++ projects. Again, it is not within the Visual Studio IDE, but it has some great advantages if you don't mind pressing Alt-Tab :-)
works with both C and C++ sources
is quite fast in it's indexing and searching; it's a pleasure to use, IMHO
is a visual tool
is a free and Open Source tool
See http://sourcenav.berlios.de/screenshots/ for some screenshots
In particular, you are looking for the Cross-Reference Browser:
"It can find every call of a function, or tell you everything a
particular function calls. It creates tree diagrams that show
essential relationships within the project's symbol database, such as
the function call hierarchy tree. You can traverse up and down the
hierarchy tree, as well as expand or restrict the tree. You can select
items in the hierarchy and display their Refers-to and Referred-by
relationships; these relationships are based on the "point-of-view" of
the selected symbol."
Though this example screenshot from the tutorial, "Using the Cross-Reference Browser" shows Referred-by relationships (using red arrows) for a class and not a function, the latter use case would be very similar. You can also browse what functions / methods are getting called from a function, and that would be a Refers-to relationship, shown using blue arrows instead of red.
Do give it a try! As I mentioned before, I have been a happy user of this tool; it's not very well-known, but is a good piece of software (that also stands as an example for how useful Tcl/Tk can be in the right hands).
I think you should be able to use VS Plugin - CodeGraph on your solution and look for the specific function you are looking for and go on from there. It does static analysis on your solution and generates a nice graph of the call flows. Check "https://marketplace.visualstudio.com/items?itemName=YaobinOuyang.CodeAtlas". Hope this helps.

A higher level tool for writing a pass in LLVM

Normally If you want to modify LLVM IR, you need to write a pass. However, writing a pass by yourself is an overkill sometimes if a higher level tool could facilitate you.
For example, someone might wish to log every load and store in the program. For that purpose, he would need to inject code that does the logging. Now if there is a higher level tool, it can provide callbacks to us to write what we want. So in this case, for example, it could provide us OnLoad and OnStore functions which we can fill to tell the tool what to do on each load and store. Does such kind of a tool exist?
So basically I want something similar to what is provided by Dynamic Binary Instrumentation tools but that works with LLVM, for compile time code injection.
I think you should consider using PIN instead of LLVM for such things: http://www.pintool.org/
PIN enables you insert instrumentation/analyze code at several granularity levels: instruction, basic block, function, traces and even load/unload of shared libraries. Is may be a way more practical since you won't need to compile the application - so you can analyze programs wich aren't open source for example.
There are version of PIN for windows and linux.
PS: Another tool that seems useful: http://eces.colorado.edu/~blomsted/llvmpin/llvmpin.html

Open-source C++ scanning library

Rationale: In my day-to-day C++ code development, I frequently need to
answer basic questions such as who calls what in a very large C++ code
base that is frequently changing. But, I also need to have some
automated way to exactly identify what the code is doing around a
particular area of code. "grep" tools such as Cscope are useful (and
I use them heavily already), but are not C++-language-aware: They
don't give any way to identify the types and kinds of lexical
environment of a given use of a type or function a such way that is
conducive to automation (even if said automation is limited to
"read-only" operations such as code browsing and navigation, but I'm
asking for much more than that below).
Question: Does there exist already an open-source C/C++-based library
(native, not managed, not Microsoft- or Linux-specific) that can
statically scan or analyze a large tree of C++ code, and can produce
result sets that answer detailed questions such as:
What functions are called by some supplied function?
What functions make use of this supplied type?
Ditto the above questions if C++ classes or class templates are involved.
The result set should provide some sort of "handle". I should be able
to feed that handle back to the library to perform the following types
of introspection:
What is the byte offset into the file where the reference was made?
What is the reference into the abstract syntax tree (AST) of that
reference, so that I can inspect surrounding code constructs? And
each AST entity would also have file path, byte-offset, and
type-info data associated with it, so that I could recursively walk
up the graph of callers or referrers to do useful operations.
The answer should meet the following requirements:
API: The API exposed must be one of the following:
C or C++ and probably is "C handle" or C++-class-instance-based
(and if it is, must be generic C o C++ code and not Microsoft- or
Linux-specific code constructs unless it is to meet specifics of
the given platform), or
Command-line standard input and standard output based.
C++ aware: Is not limited to C code, but understands C++ language
constructs in minute detail including awareness of inter-class
inheritance relationships and C++ templates.
Fast: Should scan large code bases significantly faster than
compiling the entire code base from scratch. This probably needs to
be relaxed, but only if Incremental result retrieval and Resilient
to small code changes requirements are fully met below.
Provide Result counts: I should be able to ask "How many results
would you provide to some request (and no don't send me all of the
results)?" that responds on the order of less than 3 seconds versus
having to retrieve all results for any given question. If it takes
too long to get that answer, then wastes development time. This is
coupled with the next requirement.
Incremental result retrieval: I should be able to then ask "Give me
just the next N results of this request", and then a handle to the
result set so that I can ask the question repeatedly, thus
incrementally pulling out the results in stages. This means I
should not have to wait for the entire result set before seeing
some subset of all of the results. And that I can cancel the
operation safely if I have seen enough results. Reason: I need to
answer the question: "What is the build or development impact of
changing some particular function signature?"
Resilient to small code changes: If I change a header or source
file, I should not have to wait for the entire code base to be
rescanned, but only that header or source file
rescanned. Rescanning should be quick. E.g., don't do what cscope
requires you to do, which is to rescan the entire code base for
small changes. It is understood that if you change a header, then
scanning can take longer since other files that include that header
would have to be rescanned.
IDE Agnostic: Is text editor agnostic (don't make me use a specific
text editor; I've made my choice already, thank you!)
Platform Agnostic: Is platform-agnostic (don't make me only use it
on Linux or only on Windows, as I have to use both of those
platforms in my daily grind, but I need the tool to be useful on
both as I have code sandboxes on both platforms).
Non-binary: Should not cost me anything other than time to download
and compile the library and all of its dependencies.
Not trial-ware.
Actively Supported: It is likely that sending help requests to mailing lists
or associated forums is likely to get a response in less than 2
days.
Network agnostic: Databases the library builds should be able to be used directly on
a network from 32-bit and 64-bit systems, both Linux and Windows
interchangeably, at the same time, and do not embed hardcoded paths
to filesystems that would otherwise "root" the database to a
particular network.
Build environment agnostic: Does not require intimate knowledge of my build environment, with
the notable exception of possibly requiring knowledge of compiler
supplied CPP macro definitions (e.g. -Dmacro=value).
I would say that CLang Index is a close fit. However I don't think that it stores data in a database.
Anyway the CLang framework offer what you actually need to build a tool tailored to your needs, if only because of its C, C++ and Objective-C parsing / indexing capabitilies. And since it's provided as a set of reusable libraries... it was crafted for being developed on!
I have to admit that I haven't used either because I work with a lot of Microsoft-specific code that uses Microsoft compiler extensions that i don't expect them to understand, but the two open source analyzers I'm aware of are Mozilla Pork and the Clang Analyzer.
If you are looking for results of code analysis (metrics, graphs, ...) why not use a tool (instead of API) to do that? If you can, I suggest you to take a look at Understand.
It's not free (there's a trial version) but I found it very useful.
Maybe Doxygen with GraphViz could be the answer of some of your constraints but not all,for example the analysis of Doxygen is not incremental.

Reading/Understanding third-party code

When you get a third-party library (c, c++), open-source (LGPL say), that does not have good documentation, what is the best way to go about understanding it to be able to integrate into your application?
The library usually has some example programs and I end up walking through the code using gdb. Any other suggestions/best-practicies?
For an example, I just picked one from sourceforge.net, but it's just a broad engineering/programming question:
http://sourceforge.net/projects/aftp/
I frequently use a couple of tools to help me with this:
GNU Global. It generates cross-referencing databases and can produce hyperlinked HTML from source code. Clicking function calls will take you to their definitions, and you can see lists of all references to a function. Only works for C and perhaps C++.
Doxygen. It generates documentation from Javadoc-style comments. If you tell it to generate documentation for undocumented methods, it will give you nice summaries. It can also produce hyperlinked source code listings (and can link into the listings provided by htags).
These two tools, along with just reading code in Emacs and doing some searches with recursive grep, are how I do most of my source reverse-engineering.
One of the better ways to understand it is to attempt to document it yourself. By going and trying to document it yourself, it forces you to really dive in and test and test and test and make sure you know what each statement is doing at what times. Then you can really start to understand what the previous developer may have been thinking (or not thinking for that matter).
Great question. I think that this should be addressed thoroughly, so I'm going to try to make my answer as thorough as possible.
One thing that I do when approaching large projects that I've either inherited or contributing to is automatically generate their sources, UML diagrams, and anything that can ease the various amounts of A.D.D. encountered when learning a new project:)
I believe someone here already mentioned Doxygen, that's a great tool! You should look into it and write a small bash script that will automatically generate sources for the application you're developing in some tree structure you've setup.
One thing that I've haven't seen people mention is BOUML! It's fantastic and free! It automatically generates reverse UML diagrams from existing sources and it supports a variety of languages. I use this as a way to really capture the big picture of what's going on in terms of architecture and design before I start reading code.
If you've got the money to spare, look into Understand for %language-here%. It's absolutely great and has helped me in many ways when inheriting legacy code.
EDIT:
Try out ack (betterthangrep.com), it is a pretty convenient script for searching source trees:)
Familiarize yourself with the information available in the headers. The functions you call will be declared there. Then try to identify the valid arguments and pre-/post-conditions of the functions, as those are your primary guidance (even if they are not documented!). The example programs are your next bet.
If you have code completion/intellisense I like opening up the library and going '.' or 'namespace::' and seeing what comes up. I always find it helpful, you can navigate through the objects/namespaces and see what functionality they have. This is of course assuming its an OOP library with relatively good naming of functions/objects.
There really isn't a silver bullet other than just rolling up your sleeves and digging into the code.
This is where we earn our money.
Three things;
(1) try to run the test or example apps available, set low debug levels, and walk through logs.
(2) use source navigator tool / cscope ( available both on windows and linux) and browse the code to understand the flow.
(3) also in parallel use gdb to walk into code while running test/example apps.

Finding very similar program executions

I was wondering if its possible / anyone knows any tools out there to compare the execution of two related programs (for example, assignments on a class) to see how similar they are. For example, not to compare the names of functions, but how they use syscalls. One silly case of this would be testing if a C string is printed as (see example below) in more than one case one separate program.
printf("%s",str)
Or as
for (i=0;i<len;i++) printf("%c",str[i]);
I havenĀ“t put much thought into this, but i would imagine that strace / ltrace (maybe even oprofile) would be a good starting point. Particularly, this is for UNIX C / C++ programs.
Thanks.
If you have access to the source code of the two programs, you may build a graph of the functions (each function is a node, and there is an edge from A to B if A calls B()), and compute some graph similarity metrics. This will catch a source code copy made by renaming and reorganizing.
An initial idea would be to use ltrace and strace to log the calls and then use diff on the logs. This would obviously only cover the library an system calls. If you need a more fine granular logging, the oprofile might help.
If you have access to the source code you could instrument your code by compiling it with profiling information and then parse the gcov output after the runs. A pure static source code analysis may be sufficient if your code is not taking different routes depending on external data/state.
I think you can do this kind of thing using valgrind.
A finer-grained version (and depending on what is the access to the program source and what you exactly want in terms of comparison) would be to use kprobes.
Kernel Dynamic Probes (Kprobes) provides a lightweight interface for kernel modules to implant probes and register corresponding probe handlers. A probe is an automated breakpoint that is implanted dynamically in executing (kernel-space) modules without the need to modify their underlying source. Probes are intended to be used as an ad hoc service aid where minimal disruption to the system is required. They are particularly advocated in production environments where the use of interactive debuggers is undesirable. Kprobes also has substantial applicability in test and development environments. During test, faults may be injected or simulated by the probing module. In development, debugging code (for example a printk) may be easily inserted without having to recompile to module under test.