Generating a tree of all possible call stacks - c++

I am trying to tinker with some library code written in C++. A fairly complex application sits on top of the library. To tinker with the code, I often need to understand how a library function has been used throughout the codebase, and make sure that I am not breaking any downstream clients.
Suppose foo() is exported from my library's DLL. In client code, bar() calls foo(), and baz() calls bar(). I need to make sure that bar() and baz() both work after my changes. In my case, the call stack is quite deep and not easy to trace manually, because there is not one call stack: there are numerous ways my library function can land at the top of a call stack.
Using either Visual Studio, or g++, or clang, is there a way to generate a tree such that my library function is at the root, and the branches are all the various ways my function can land at the top of the call stack? I mean does such a feature already exist in one of the popular toolchains? If not, do you know any other way of generating such a tree?

I don't think any of the compilers have options to generate this information.
In the general case, there are many confounding factors that would make this very difficult:
If there's recursion in the code, then the tree you want is actually a graph/network with cycles.
Virtual methods, function pointers, and member function pointers probably make this the equivalent of the halting problem. If you have two concrete classes A and B that share a common base class that offers virtual method foo(), then you'd have to do exhaustive analysis to determine whether a particular call of foo() through a pointer or reference to the base class should be counted as a call to A::foo() or B::foo() or both. Ditto for the various flavors of function pointers.
If you rely on system or other third-party libraries that can call back into your code, you'd better have source for them. For example, a Windows GUI program typically has window procedures that are called from system code, possibly in response to a call from your code into the system. Since you wouldn't have the Windows sources, you'd have to assume that any and all of your callbacks could be called at any time, and thus your "tree" would have many roots.
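To make the virtual-method point concrete, here is a minimal sketch. The names Base, A, B, make and call_through_base are made up for illustration and follow the A/B example above. The concrete type is picked from a runtime value, so a tool reading only the source has to assume the call can land in either override.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical classes, named after the A/B example in the text.
struct Base {
    virtual ~Base() = default;
    virtual std::string foo() const = 0;
};

struct A : Base {
    std::string foo() const override { return "A::foo"; }
};

struct B : Base {
    std::string foo() const override { return "B::foo"; }
};

// The concrete type depends on a runtime value (think user input or a
// config file), so it is not known when the code is compiled.
std::unique_ptr<Base> make(bool use_a) {
    if (use_a) return std::make_unique<A>();
    return std::make_unique<B>();
}

// From the source alone, this call site may reach A::foo or B::foo.
// A conservative static call graph has to record an edge to both.
std::string call_through_base(const Base& b) {
    return b.foo();
}

// Usage: the same call site ends up in different functions.
void dispatch_example() {
    assert(call_through_base(*make(true)) == "A::foo");
    assert(call_through_base(*make(false)) == "B::foo");
}
```

With two overrides the over-approximation is harmless, but in a large codebase every such call site fans out to all overrides, which is where the "tree" starts to blow up.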
The modern way to deal with this is not to analyze all the ways your library can be called, but to document all the ways it should be called. Build a test suite that calls the library in all the reasonable ways you want to support. Then you can tinker and then run your test suite to see if you've broken the library's contract. If, in integration testing, you find a client of the library that's broken by your changes, it indicates the test suite is incomplete or the client is calling the library in an unsupported way.

Related

Is there a way in C++ to implement Coroutines that apply to multiple levels of function calls?

I wish to implement a way in C++ to be able to midway through a function cease execution within the function and return up some number of function calls. For example, I might have a function F and I want this code to return all the way up to F without having to have any special code within F or the functions F calls other than the one performing this return. I then want to be able to at some point execute a piece of code that resumes all execution within the original location. I imagine this can be done via stack manipulation but I have no idea if there is an easier way to do this.
Given that we're dealing with DLLs, it's pretty safe to say that we're dealing with Windows. Windows has no real notion of "exiting" a DLL "back" to the executable. Windows does have an idea how standard function calls work (WINAPI), but doesn't even require that the functions from GetProcAddress are WINAPI calls. And that's just about the functions on external interfaces; inline functions don't need to obey the rules for GetProcAddress. In fact, Windows doesn't really require your code to be built from functions at all. A Finite State Compiler may emit code which uses jumps instead of calls.
So, the challenge here is that your executable has an arbitrary ABI, GCC has its own ABI, and the two are entirely incompatible. You figured out that much when you implemented your own "queue" mechanism. There's no generic mechanism possible.

C++ calling conventions -- converting between win32 and Linux/GCC?

I'm knee deep in a project to connect Windows OpenVR applications running in Wine directly to Linux native SteamVR via a Winelib wrapper, the idea being to sidestep all the problems with trying to make what is effectively a very complicated device driver run inside Wine itself. I pretty much immediately hit a wall. The problem appears to be related to calling conventions, though I've had trouble getting meaningful information out of winedbg, so there's a chance I'm way way off.
The OpenVR API is C++ and consists primarily of classes filled with virtual methods. The application calls a getter (VR_GetGenericInterface) to acquire a derivative class object implementing those methods from the (closed source) runtime. It then calls those methods directly on the object given to it.
My current attempt goes like this: My wrapped VR_GetGenericInterface returns a custom wrapper class for the requested interface. This class's methods are defined in their own file separately compiled with -mabi=ms. It calls non-member methods in a separate file that is compiled without -mabi=ms, which finally make the corresponding call into the actual runtime.
This seems to work, until the application calls a method that returns a struct of some description. Then the application segfaults on the line the call happened, apparently just after the call returned, as my code isn't anywhere on the stack at that point and I verified with printfs that my wrapped class reaches the point of returning to the app.
This leads me to think there's yet another calling convention gotcha I haven't accounted for yet, related to structs and similar complex types. From googling it appears returning a struct is one of the more poorly-defined things in ABIs typically, but I found no concrete answers or descriptions of the differences.
What is most likely happening, and how can I dig deeper to see what exactly the application is expecting?

Hook functions class in offset (from .exe in a .dll)

I have a .exe application that I need to customize, so I need to hook a DLL into it for my changes to be loaded. So far, nothing unusual.
The scenario is this:
Hook(0xOffset, &myClass::myFunc);
There is a class in the .exe that I need to rewrite completely, and I've done that in my DLL, but I'm having trouble hooking the class's member functions, which are not static. I've read many topics and could not get any of the methods presented to work. In some cases the compiler rejects the code; in other cases it compiles, but the .exe cannot find the actual address of the function.
A friend gave me a solution, but it is hard to understand how I would call the function with it, and from what I saw it would add a lot of code and many loops to my source, so to speak.
Could someone help me?
Member functions are indeed far more complex. You have to deal with normal inheritance, multiple inheritance, and virtual inheritance; with direct calls and virtual calls. Possibly the worst is dealing with member function pointers, which are entirely unlike normal function pointers.
As a result, many solutions deal only with the easy cases. It's perfectly normal that a solution capable of dealing with all edge cases takes a lot of code.
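As a small illustration of why member function pointers are a different beast, here is a sketch with a made-up Widget class and a made-up widget_add_thunk helper (neither comes from the question). It only shows the type and call-syntax difference for a simple non-virtual class. It is not a hooking solution, and it ignores calling-convention issues such as how the target executable passes the this pointer.

```cpp
#include <cassert>

// Hypothetical class standing in for the one being hooked.
struct Widget {
    int value = 0;
    int add(int n) { value += n; return value; }
};

// A free function with the object made explicit. A plain function pointer
// can refer to this, which is why hooks often go through such a thunk.
int widget_add_thunk(Widget* self, int n) { return self->add(n); }

void member_pointer_example() {
    // Pointer to member function: a distinct type that needs an object
    // (and the .* or ->* operator) to be called.
    int (Widget::*pmf)(int) = &Widget::add;

    Widget w;
    assert((w.*pmf)(2) == 2);    // value: 0 -> 2

    Widget* pw = &w;
    assert((pw->*pmf)(3) == 5);  // value: 2 -> 5

    // Ordinary function pointer to the thunk; the object is just an argument.
    int (*pf)(Widget*, int) = &widget_add_thunk;
    assert(pf(&w, 4) == 9);      // value: 5 -> 9
}
```

A line like Hook(0xOffset, &myClass::myFunc) typically fails to compile because a hook routine expects something like pf, while &myClass::myFunc has the type of pmf, and the two are not convertible to each other.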

tool for finding which functions can ultimately cause a call to a (list of) low level functions

I have a very large C++ program where certain low level functions should only be called from certain contexts or while taking specific precautions. I am looking for a tool that shows me which of these low-level functions are called by much higher level functions. I would prefer this to be visible in the IDE with some drop down or labeling, possibly in annotated source output, but any easier method than manually searching the call-graph will help.
This is a problem of static analysis and I'm not helped by a profiler.
I am mostly working on mac, linux is OK, and if something is only available on windows then I can live with that.
Update
Just having the call-graph does not make it that much quicker to answer the question, "does foo() potentially cause a call to x() y() or z()". (or I'm missing something about the call-graph tools, perhaps I need to write a program that traverses it to get a solution?)
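For what it's worth, the traversal itself is short once the graph is in memory. The sketch below assumes the call graph has already been exported as a caller-to-callees map; the CallGraph type, the reachable_targets function and the sample graph are all made up for illustration and do not correspond to the output format of any particular tool.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// caller -> direct callees. How this gets filled from a real call-graph
// dump is left open; the representation is an assumption.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Returns every function in `targets` that `start` can reach through any
// chain of calls. The `seen` set makes recursion (cycles) terminate.
std::set<std::string> reachable_targets(const CallGraph& graph,
                                        const std::string& start,
                                        const std::set<std::string>& targets) {
    std::set<std::string> seen{start};
    std::set<std::string> hits;
    std::vector<std::string> work{start};
    while (!work.empty()) {
        std::string fn = work.back();
        work.pop_back();
        auto it = graph.find(fn);
        if (it == graph.end()) continue;  // leaf, or not present in the dump
        for (const std::string& callee : it->second) {
            if (targets.count(callee)) hits.insert(callee);
            if (seen.insert(callee).second) work.push_back(callee);
        }
    }
    return hits;
}

// Usage on a made-up graph: foo -> helper -> x, with helper calling foo
// back (a cycle). y is only called from `other`, which foo never reaches.
void reachability_example() {
    CallGraph g{
        {"foo", {"helper", "log"}},
        {"helper", {"x", "foo"}},
        {"other", {"y"}},
    };
    std::set<std::string> found = reachable_targets(g, "foo", {"x", "y", "z"});
    assert(found == std::set<std::string>{"x"});
}
```

The answer is only as good as the graph: edges missing because of indirect or virtual calls give false negatives, and over-approximated edges give false positives.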
There is the Clang Static Analyzer, which is built on LLVM and should also be available on OS X. I believe it is integrated into Xcode. Either way, a GUI exists.
Furthermore, there are several LLVM passes that can generate call graphs, but I'm not sure if this is what you want.
Scientific Toolworks' "Understand" tool is supposed to be able to produce call graphs for C and C++.
Doxygen also supposedly produces call graphs.
I don't have any experience with either of these, but I do have some harsh opinions. Keep in mind that I'm a vendor of another tool, so take this opinion with a big grain of salt.
I have experience building reasonably accurate call graphs for massive C systems (25 million lines) with 250,000 functions.
One issue I encounter in building a realistic call graph is indirect function calls and, for C++, overloaded method calls. In big systems, there are a lot of both. To determine what gets called when FOO gets invoked, your tool has to have a deep semantic understanding of how the compiler/language resolves an overloaded call and, for indirect function calls, a reasonably precise determination of what a function pointer might actually point to in a big system. If you don't get these reasonably right, your call graph will contain a lot of false positives (e.g., bogus claims that A calls B), and at scale false positives are a disaster.
For C++, you must have what amounts to the full compiler front end. Neither Understand nor Doxygen has this, so I don't see how they can actually understand C++'s overloading/Koenig lookup rules. Neither Understand nor Doxygen makes any attempt that I know of to reason about indirect function calls.
Our DMS Software Reengineering Toolkit does build call graphs for C reasonably well, even with indirect function pointers, using a C language-precise front end.
We have a C++ language-precise front end, and it does the overload resolution correctly (to the extent the C++ committee agrees on it, and we understand what they said, and what the individual compilers do [they don't always agree]), and we have something like Doxygen that shows this information. We don't presently have function pointer analysis for C++, but we are working on it (we have full control flow graphs within methods, and that's a big step).
I understand Clang has some option for computing call graphs, and I'd expect that to be accurate on overloads, since Clang is essentially a C++ compiler implemented as a bunch of components. I don't know what, if anything, Clang does to analyze function pointers.

Change the logic of application at run-time

I was wondering if it is possible to change the logic of an application at runtime. Maybe we could replace the implementation of an abstract class with another implementation? Or maybe we could replace a shared library at runtime...
update: Suppose that I've got two implementations of function foo(x, y) and can use any of them based on strategy pattern. Now I want to know if it's possible to add a third implementation of foo(x, y) without restarting the application.
You can use a plugin (a library that you load at runtime) that exposes a new foo function.
I remember we implemented something similar at school, a calculator in which we could add new operations at runtime, without having to restart the program. See dlsym and dlopen.
Addenda
Be very careful when dlclose-ing a plugin that it is not still used in some active call stack frame. On Linux you can call dlopen many thousands of times (so you could accept never dlclose-ing plugins, at the cost of some address space leak).
Exactly as you said: you can "replace the implementation of an abstract class with another implementation", if by that you mean using runtime polymorphism and swapping instances of one set of concrete classes for instances of another set.
More specifically, there is a well-known pattern called Strategy pattern exactly for this purpose. Have a look at the wiki page, as it explains this very nicely, even with a code example along with diagram.
The C++ virtual function mechanism does not let you change the implementation behind an existing object at run time; an object's dynamic type is fixed once it is constructed.
However, you can implement any such change at run time with function pointers.
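A minimal sketch of that idea, using std::function rather than raw function pointers so that lambdas can be stored too. The FooRegistry class and the names "sum", "product" and "max" are made up for illustration. In a plugin setup, the callable being registered could just as well wrap a symbol obtained from a library loaded at runtime; that part is not shown here.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

using Foo = std::function<int(int, int)>;

// Hypothetical registry: implementations of foo(x, y) keyed by name.
class FooRegistry {
public:
    void add(const std::string& name, Foo impl) {
        impls_[name] = std::move(impl);
    }

    // Makes the named implementation current.
    // Returns false if no implementation with that name is registered.
    bool select(const std::string& name) {
        auto it = impls_.find(name);
        if (it == impls_.end()) return false;
        current_ = it->second;
        return true;
    }

    int foo(int x, int y) const { return current_(x, y); }

private:
    std::map<std::string, Foo> impls_;
    Foo current_ = [](int, int) { return 0; };  // placeholder until select()
};

void registry_example() {
    FooRegistry r;
    r.add("sum", [](int x, int y) { return x + y; });
    r.add("product", [](int x, int y) { return x * y; });

    bool ok = r.select("sum");
    assert(ok);
    assert(r.foo(3, 4) == 7);

    // A third implementation, registered while the program is running.
    r.add("max", [](int x, int y) { return x > y ? x : y; });
    ok = r.select("max");
    assert(ok);
    assert(r.foo(3, 4) == 4);
    (void)ok;  // silence unused-variable warnings when asserts are disabled
}
```

Callers only ever see r.foo(x, y), so adding or switching implementations does not require restarting anything. If foo can be called from several threads while select runs, the swap would need synchronization, which this sketch leaves out.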
Here is an article on self-modifying code that I read recently: http://mainisusuallyafunction.blogspot.com/2011/11/self-modifying-code-for-debug-tracing.html