Tracking C/C++ data structure sizes

I am trying to find a tool that can show me information about all the data structures in a program. I want to know when certain data structures were accessed and how their sizes changed throughout the course of the program. For example I want the tool to know that all the nodes in a linked list belong to one single data structure. Does a tool like this exist? I couldn't seem to find one through googling. Thanks

Some toolchains, for example Xcode's, provide debugging features that let you track memory use, CPU time, and network usage. Tracking a data structure in memory is possible if you set breakpoints in the program; without breakpoints it is not practical to follow how a data structure changes, since the program simply runs too fast. What you need is a good IDE with debugging and profiling support.

My first question is: what's your compiler? One person mentioned gdb as a useful tool, but that's only the case if you're using gcc/g++. Xcode has its own compiler/debugger, and Microsoft has its own as well.
Ultimately, this is about knowing how to use the debugger that goes with your compiler. Also, realize that learning to use that debugger properly can be just as daunting a task as learning to use the compiler itself.
There are also profilers available, but again, it will depend somewhat on your compiler as to which ones are available for you. Your keywords for googling will be "C++", "debugger", and "profiler", ideally along with the name of your compiler.
Be aware, as well, that your choice of compiler and optimization settings may change the statistics you see, even when your program runs against the same data.

Related

Teaching gdb to understand micro-threads from core files

I am working on a huge program that employs a (custom-built) micro-threading solution. It sometimes happens that I need to debug a crash. During such times, it is useful to be able to switch from one micro-thread to another.
If I'm doing live debugging, I can replace all of the registers to those that came from the micro-thread context. I have written a macro to do just that, and it works really well.
The problem is that I cannot change the register values if I am doing post-mortem debugging (from a core file). In such a case, I have no way to tell GDB to change its concept of what the current frame is, as all registers are considered read-only in that case.
Is there a way to tell GDB about my custom context management?
Shachar
There's not a simple, built-in way to do this in gdb.
I think probably the simplest way would be to write a version of gdbserver that can read your core files and that presents your micro-threads to gdb as real threads. There's been at least one gdbserver out there that can read core files already, so maybe it isn't crazily hard. However, I couldn't really say for sure.

Edit and Continue on GDB

I know that E&C is a controversial subject and some say that it encourages a wrong approach to debugging, but still - I think we can agree that there are numerous cases when it is clearly useful - experimenting with different values of some constants, redesigning GUI parameters on-the-fly to find a good look... You name it.
My question is: Are we ever going to have E&C on GDB? I understand that it is a platform-specific feature and needs some serious cooperation with the compiler, the debugger and the OS (MSVC has this one easy as the compiler and debugger always come in one package), but... It still should be doable. I've even heard something about Apple having it implemented in their version of GCC [citation needed]. And I'd say it is indeed feasible.
Knowing all the hype about MSVC's E&C (my experience says it's the first thing MSVC users mention when asked "why not switch to Eclipse and gcc/gdb"), I'm seriously surprised that after quite some years GCC/GDB still doesn't have such a feature. Are there any good reasons for that? Is someone working on it as we speak?
It is a surprisingly non-trivial amount of work, encompassing many design decisions and feature tradeoffs. Consider: you are debugging. The debuggee is suspended. Its image in memory contains the object code of the source, and the binary layout of objects, the heap, the stacks. The debugger is inspecting its memory image. It has loaded debug information about the symbols, types, address mappings, pc (ip) to source correspondences. It displays the call stack, data values.
Now you want to allow a particular set of possible edits to the code and/or data, without stopping the debuggee and restarting. The simplest might be to change one line of code to another. Perhaps you recompile that file, or just that function, or just that line. Now you have to patch the debuggee image to execute that new line of code the next time you step over it or otherwise run through it. How does that work under the hood? What happens if the new code is larger than the line of code it replaced? How does it interact with compiler optimizations? Perhaps you can only do this on a target specially compiled for EnC debugging. Perhaps you will constrain the sites at which EnC is legal. Consider: what happens if you edit a line of code in a function suspended down in the call stack? When the code returns there, does it run the original version of the function or the version with your line changed? If the original version, where does that source come from?
Can you add or remove locals? What does that do to the call stack of suspended frames? Of the current function?
Can you change function signatures? Add fields to / remove fields from objects? What about existing instances? What about pending destructors or finalizers? Etc.
There are many, many functionality details to attend to in order to make any kind of usable EnC work. Then there are many cross-tool integration issues necessary to provide the infrastructure to power EnC. In particular, it helps to have some kind of repository of debug information that can make the before- and after-edit debug information and object code available to the debugger. For C++, the incrementally updatable debug information in PDBs helps. Incremental linking may help too.
Looking from the MS ecosystem over into the GCC ecosystem, it is easy to imagine that the complexity and integration issues across GDB/GCC/binutils, the myriad of targets, the need for some EnC-specific target abstractions, and the "nice to have but inessential" nature of EnC are why it has not yet appeared in GDB/GCC.
Happy hacking!
(p.s. It is instructive and inspiring to look at what the Smalltalk-80 interactive programming environment could do. In St80 there was no concept of "restart": the image and its object memory were always live, and if you edited any aspect of a class, everything kept running. In such environments object versioning was not a hypothetical.)
I'm not familiar with MSVC's E&C, but GDB has some of the things you've mentioned:
http://sourceware.org/gdb/current/onlinedocs/gdb/Altering.html#Altering
17. Altering Execution
Once you think you have found an error in your program, you might want to find out for certain whether correcting the apparent error would lead to correct results in the rest of the run. You can find the answer by experiment, using the gdb features for altering execution of the program.
For example, you can store new values into variables or memory locations, give your program a signal, restart it at a different address, or even return prematurely from a function.
Assignment: Assignment to variables
Jumping: Continuing at a different address
Signaling: Giving your program a signal
Returning: Returning from a function
Calling: Calling your program's functions
Patching: Patching your program
Compiling and Injecting Code: Compiling and injecting code in GDB
This is a pretty good reference to the old Apple implementation of "fix and continue". It also references other working implementations.
http://sources.redhat.com/ml/gdb/2003-06/msg00500.html
Here is a snippet:
Fix and continue is a feature implemented by many other debuggers,
which we added to our gdb for this release. Sun Workshop, SGI ProDev
WorkShop, Microsoft's Visual Studio, HP's wdb, and Sun's Hotspot Java
VM all provide this feature in one way or another. I based our
implementation on the HP wdb Fix and Continue feature, which they
added a few years back. Although my final implementation follows the
general outlines of the approach they took, there is almost no shared
code between them. Some of this is because of the architectural
differences (both the processor and the ABI), but even more of it is
due to implementation design differences.
Note that this capability may have been removed in a later version of their toolchain.
UPDATE: Dec-21-2012
There is a GDB Roadmap PDF presentation that includes a slide describing "Fix and Continue" among other bullet points. The presentation is dated July-9-2012 so maybe there is hope to have this added at some point. The presentation was part of the GNU Tools Cauldron 2012.
Also, I get it that adding E&C to GDB or anywhere in Linux land is a tough chore with all the different components.
But I don't see E&C as controversial. I remember using it in VB5 and VB6, and it was probably there before that. It's also been in Office VBA since way back. And it's been in Visual Studio since VS2005; VS2003 was the only one that didn't have it, and I remember devs howling about it. They intended to add it back anyway, and they did with VS2005, and it's been there since. It works with C#, VB, and also C and C++. It's been in MS core tools for 20+ years, almost continuously (counting VB when it was standalone, and subtracting VS2003). But you could still say they had it in Office VBA during the VS2003 period ;)
And JetBrains recently added it to their C# tool Rider. They bragged about it (rightly so, imo) on their Rider blog.

What's a very easy C++ profiler (VC++)?

I've used a few profilers in the past and never found them particularly easy. Maybe I picked bad ones, maybe I didn't really know what I was expecting!
But I'd like to know if there are any 'standard' profilers which simply drop in and work? I don't believe I need massively fine-detailed reports, just to pick up major black-spots. Ease of use is more important to me at this point.
It's VC++ 2008 we're using (I run the standard edition personally). I don't suppose there are any tools in the IDE for this? I can't see any from looking at the main menus.
I suggest a very simple method (which I learned from reading Mike Dunlavey's posts on SO):
Just pause the program.
Do it several times to get a reasonable sample. If a particular function is taking half of your program's execution time, the odds are that you will catch it in the act very quickly.
If you improve that function's performance by 50%, then you've just improved overall execution time by 25%. And if you discover that it's not even needed at all (I have found several such cases in the short time I've been using this method), you've just cut the execution time in half.
I must confess that at first I was quite skeptical of the efficacy of this approach, but after trying it for a couple of weeks, I'm hooked.
VS built in:
If you have team edition you can use the Visual Studio profiler.
Other options:
Otherwise check this thread.
Creating your own easily:
I personally use an internally built one based on the Win32 API QueryPerformanceCounter.
You can make something nice and easy to use within a hundred lines of code or less.
The process is simple: create a macro at the top of each function that you want to profile called PROFILE_FUNC() and that will add to internally managed stats. Then have another macro called PROFILE_DUMP() which will dump the outputs to a text document.
PROFILE_FUNC() creates an object that will use RAII to log the amount of time until the object is destroyed. Both the constructor of this RAII object and the destructor will call QueryPerformanceCounter. You could also leave these lines in your code and control the behavior via a #define PROFILING_ON
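To make the shape of this concrete, here is a minimal sketch of what such PROFILE_FUNC()/PROFILE_DUMP() macros could look like. The stats container, helper names, and dump format below are my own assumptions for illustration, not the answerer's actual code; only the RAII-plus-QueryPerformanceCounter idea and the PROFILING_ON switch come from the answer.

```cpp
#define PROFILING_ON   // comment this out to compile the macros away

#include <windows.h>
#include <cstdio>
#include <map>
#include <string>

#ifdef PROFILING_ON
struct FuncStats {
    long long calls;
    long long ticks;
    FuncStats() : calls(0), ticks(0) {}
};

typedef std::map<std::string, FuncStats> StatsMap;

inline StatsMap& ProfileStats()
{
    static StatsMap stats;      // process-wide accumulator (not thread-safe)
    return stats;
}

class ScopedProfiler {
public:
    explicit ScopedProfiler(const char* name) : name_(name)
    {
        QueryPerformanceCounter(&start_);            // timestamp on entry
    }
    ~ScopedProfiler()
    {
        LARGE_INTEGER end;
        QueryPerformanceCounter(&end);               // timestamp on exit
        FuncStats& s = ProfileStats()[name_];
        ++s.calls;
        s.ticks += end.QuadPart - start_.QuadPart;
    }
private:
    const char*   name_;
    LARGE_INTEGER start_;
};

// Drop PROFILE_FUNC() at the top of each function you want timed.
#define PROFILE_FUNC() ScopedProfiler profileScopeGuard(__FUNCTION__)

// Call PROFILE_DUMP("profile.txt") once, e.g. at the end of main().
#define PROFILE_DUMP(path)                                                   \
    do {                                                                      \
        LARGE_INTEGER freq;                                                   \
        QueryPerformanceFrequency(&freq);                                     \
        FILE* f = std::fopen((path), "w");                                    \
        if (!f) break;                                                        \
        for (StatsMap::const_iterator it = ProfileStats().begin();            \
             it != ProfileStats().end(); ++it)                                \
            std::fprintf(f, "%s\t%lld calls\t%.3f ms total\n",                \
                         it->first.c_str(), it->second.calls,                 \
                         1000.0 * it->second.ticks / freq.QuadPart);          \
        std::fclose(f);                                                       \
    } while (0)
#else
#define PROFILE_FUNC()      ((void)0)
#define PROFILE_DUMP(path)  ((void)0)
#endif
```

With PROFILING_ON undefined, both macros expand to no-ops, so the instrumentation can stay in the code permanently.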
I always used AMD CodeAnalyst; I find it quite easy to use and it gives interesting results. I always used the time-based profile, and found that it cooperates well with my apps' debug information, letting me see where the time is spent at the procedure, C++ statement, and single assembly instruction level.
I used lt prof in the past for a quick rundown of my C++ app. It is pretty easy to work with and runs against a compiled program; it does not need any source code hooks or tweaks. There is a trial version available, I believe.
A very simple (and free) way to profile is to install the Windows debuggers (cdb/windbg), set a bp on the place of interest, and issue the wt command ("Trace and Watch Data"). Check out MSDN for more info.
Another super simple and useful profiling workflow that works in any programming language is to comment out blocks of code. After commenting out all of them, uncomment some and run your program to see the performance. If your program starts to run very slowly once some code has been uncommented, then you'll probably want to check the performance there.

Tool for analyzing C++ sources (MSVC)

I need a tool which analyzes C++ sources and says what code isn't used. The size of the sources is ~500 MB.
PC-Lint is good. If it needs to be free/open source, your choices dwindle. Cppcheck is free and will check for unused private functions. I don't think it looks for things like uninstantiated classes the way PC-Lint does.
Once again, I'll throw AQTime into the discussion. Has static code analysis for most, if not all, of the supported languages. I didn't really go into that part though, I mainly used the dynamic profilers (memory, performance and so on).
You could use a code coverage tool (dynamic analysis) to get an idea of what code isn't being executed, and then hand-analyze to see if that code is really useless.
If you want a static analysis, you need a tool that can read the entire 500 MB of source code (est. 20 million lines? Wow!) and compute a conservative estimate of what is used. This requires doing a points-to analysis over the entire system.
Here's why: if you leave out any module Z and decide that FOO is unused, you might find out later that Z happened to be the one that used FOO, or, more subtly, that Z copied a pointer value that happened to contain &FOO into a third module M, which in turn called the "unused" function through the pointer.
What this means is that no static analysis tool that reads just single modules (compilation units) can answer this question safely. And at your scale, you can't afford to make dumb mistakes.
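To make the FOO example concrete, here is a tiny hypothetical sketch (the file names and helper functions are invented for illustration). Analyzed one translation unit at a time, FOO looks unused; yet the program still reaches it through the pointer that Z hands to M.

```cpp
// Hypothetical three-module layout, condensed into one listing.

// --- module_foo.cpp ---
int FOO(int x) { return x + 1; }           // never called by name outside this file

// --- module_z.cpp ---
typedef int (*Handler)(int);
extern int FOO(int);
void register_handler(Handler h);          // implemented in module_m.cpp
void z_init() { register_handler(&FOO); }  // Z only copies FOO's address

// --- module_m.cpp ---
static Handler g_handler = 0;
void register_handler(Handler h) { g_handler = h; }
int run_handler(int x) { return g_handler ? g_handler(x) : x; }  // FOO actually runs here
```

A tool that looks only at module_foo.cpp (or only at module_m.cpp) never sees a call to FOO; only a whole-program points-to analysis can discover that g_handler may hold &FOO.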
My company, Semantic Designs, has done points-to analysis for 35-million-line systems of C code using our DMS Software Reengineering Toolkit. DMS can read very large systems of source code. It required a custom tool, not so much because the source code was in an odd (archaic) dialect of C++ (systems in extremely modern dialects can't be this big; there hasn't been enough time to code them!), but rather because in very large systems there are other peculiar factors at play. For the C system we did, there was a custom dynamic linker, and that affected the points-to analysis, which in turn had to be customized.
Because systems of the scale you are discussing always have surprises like this (BIBSEH: "Because In Big Systems, Everything Happens"), you will likely need a custom tool to answer the question. DMS is designed to be customized.
See http://www.semanticdesigns.com/Products/DMS/DMSToolkit.html and http://www.semanticdesigns.com/Products/FrontEnds/CppFrontEnd.html
A code coverage tool is what you need, but you will have to run your program through all of its functionality and see what is reported as unused. Since the code could include DLL-exported functions, you will have to make sure nothing uses them externally. Some code coverage tools: Purify, CTC++; BoundsChecker may have code coverage functionality too, if I remember right, along with a bunch of other tools.
Be very careful about removing any function that may have been exported without knowing what external program may be linking/using it.

Debugging Best Practices for C++ STL/Boost with gdb

With gdb, debugging any C++ code that uses STL/Boost is still a nightmare. Anyone who has used gdb with STL knows this. For example, see sample runs of some debugging sessions in the code here.
I am trying to reduce the pain by collecting tips. Can you please comment on the tips I have collected below (particularly which ones you have been using and any changes you would recommend to them)? I have listed the tips in decreasing order of technicality.
Is anyone using the "Stanford GDB STL utils" and "UCF GDB utils"? Are there similar utils for Boost data structures? The utils above do not seem to be usable recursively, for example for printing a vector of boost::shared_ptr in a legible manner within one command.
Write your .gdbinit file. Include, for example, C++ related beautifiers, listed at the bottom of UCF GDB utils.
Use checked/debug STL/Boost library, such as STLport.
Use logging (for example as described here)
Update: GDB has a new C++ branch.
Maybe not the sort of "tip" you were looking for, but I have to say that my experience after a few years of moving from C++ & STL to C++ & boost & STL is that I now spend a lot less time in GDB than I used to. I put this down to a number of things:
boost smart pointers (particularly "shared pointer", and the pointer containers when performance is needed). I can't remember the last time I had to write an explicit delete (delete is the "goto" of C++ IMHO). There goes a lot of GDB time tracking down invalid and leaking pointers (see the small sketch after this list).
boost is full of proven code for things you'd probably hack together an inferior version of otherwise, e.g. boost::bimap is great for the common pattern of LRU caching logic. There goes another heap of GDB time.
Adopting unit testing. boost::test's AUTO macros mean it's an absolute doddle to set up test cases (easier than CppUnit). This catches lots of stuff long before it gets built into anything you'd have to attach a debugger to.
Related to that, tools like boost::bind make it easier to design-for-test, e.g. algorithms can be more generic and less tied up with the types they operate on; this makes plugging them into test shims/proxies/mock objects etc. easier (that, and the fact that exposure to boost's template-tasticness will encourage you to "dare to template" things you'd never have considered before, yielding similar testing benefits).
boost::array. "C array" performance, with range checking in debug builds.
boost is full of great code you can't help but learn from
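To illustrate the smart-pointer point above, here is a minimal sketch (not from the original answer; it just assumes C++03 with Boost available): ownership is held by the container and by whatever copies of the shared_ptr you keep, so there is no explicit delete to get wrong and far fewer dangling-pointer sessions in GDB.

```cpp
#include <boost/shared_ptr.hpp>
#include <vector>
#include <iostream>

struct Node {
    explicit Node(int v) : value(v) {}
    int value;
};

typedef boost::shared_ptr<Node> NodePtr;

int main()
{
    std::vector<NodePtr> nodes;
    for (int i = 0; i < 3; ++i)
        nodes.push_back(NodePtr(new Node(i)));   // ownership lives in the vector

    NodePtr kept = nodes[1];                     // sharing is explicit and cheap
    nodes.clear();                               // the other nodes are freed here, no delete

    std::cout << kept->value << "\n";            // still valid: the refcount keeps it alive
    return 0;                                    // last reference released automatically
}
```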
You might look at:
Inspecting standard container (std::map) contents with gdb
I think the easiest option is to use logging (well, I actually use debug prints, but I don't think that distinction matters). The biggest advantage is that you can inspect any type of data, many times per program execution, and then search the output with a text editor to look for interesting data. Note that this is very fast. The disadvantage is obvious: you must preselect the data you want to log and the places where to log it. However, that is not such a serious issue, because you usually know where in the code bad things are happening (and if not, you just add sanity checks here and there, and then you will know).
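For what it's worth, a minimal sketch of the kind of debug print I mean (the macro name, the ENABLE_DLOG guard, and the output format are just my illustration): each line is tagged with file and line so the log is easy to grep afterwards.

```cpp
#include <iostream>

// Minimal debug-print macro: tags each value with where it was logged.
// Works for anything with an operator<<; name and format are illustrative only.
#ifdef ENABLE_DLOG
#define DLOG(expr) \
    (std::cerr << __FILE__ << ":" << __LINE__ << "  " #expr " = " << (expr) << "\n")
#else
#define DLOG(expr) ((void)0)
#endif

// usage:
//   DLOG(container.size());
//   DLOG(*iterator);
```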
Checked/debug libraries are good, but they are better as a testing tool (e.g. run it and see if I'm doing anything wrong) and not as good for debugging a specific issue. They can't detect a flaw in user code.
Otherwise, I use plain GDB. It is not as bad as it sounds, although it might be if you are scared of "print x" printing a screenful of junk. If you have debugging information, things like printing a member of a std::vector work, and if anything fails, you can still inspect the raw memory with the x command. But if I know what I'm looking for, I use option 1: logging.
Note that the "difficult to inspect" structures are not only STL/Boost, but also from other libraries, like Qt/KDE.