Why is gdb backtrace output so ugly? - gdb

A complex C++ program of mine has many lambda calls. When I invoke bt at the gdb prompt, it gives me this:
I have to say it's too ugly to understand. How hard would it be for the gdb community to make this readable and hierarchical, so that I can easily see who calls whom?
Do lambda expressions make that hard to do?

There's not really a reason things are ugly per se. Instead what is happening is that gdb simply prints the name of the symbol that it knows, and sometimes C++ symbol names are just unreadably long.
Maybe gdb could be modified to do something about this, but again, there isn't really a reason (which IIUC is what you're asking about) -- just that nobody has tried.
It's also possible to write a Python frame filter for gdb that will shorten symbol names as you like. This may be best because in general it seems difficult to figure out how to shorten names.
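As a rough illustration of the frame filter idea, here is a minimal sketch. The helper name shorten_symbol, the filter name "short-names" and the 60-character cap are all made up for this example; only the gdb.frame_filters dictionary and the FrameDecorator class come from gdb's Python API. The shortening rule is deliberately naive: it collapses every top-level template argument list to <...>, and it will mangle names such as operator< that contain unbalanced angle brackets.

```python
def shorten_symbol(name, max_len=60):
    """Collapse each top-level template argument list to '<...>' and cap the length.

    Naive on purpose: names with unbalanced angle brackets (operator<,
    operator<<) are not handled.
    """
    out = []
    depth = 0
    for ch in name:
        if ch == '<':
            if depth == 0:
                out.append('<...>')
            depth += 1
        elif ch == '>' and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
    short = ''.join(out)
    if len(short) > max_len:
        short = short[:max_len - 3] + '...'
    return short


try:
    # These modules only exist inside a gdb built with Python support.
    import gdb
    from gdb.FrameDecorator import FrameDecorator
except ImportError:
    gdb = None

if gdb is not None:
    class ShortNameDecorator(FrameDecorator):
        """Wrap a frame and report a shortened function name."""

        def function(self):
            name = super().function()
            # function() may return an address instead of a string.
            return shorten_symbol(name) if isinstance(name, str) else name

    class ShortNameFilter:
        """Frame filter that decorates every frame in the backtrace."""

        def __init__(self):
            self.name = "short-names"   # hypothetical filter name
            self.priority = 100
            self.enabled = True
            gdb.frame_filters[self.name] = self

        def filter(self, frame_iter):
            return map(ShortNameDecorator, frame_iter)

    ShortNameFilter()
```

Saved to a file and loaded with gdb's source command, this should make subsequent bt output use the shortened names. What counts as a good shortening is a matter of taste, which is the hard part mentioned above.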

Related

Debugging Template code: What are some modern ways to do this on the command line?

Now this works, but it's pretty ugly, and not too friendly when you need to do it a whole lot.
(gdb) break diff_match_patch<std::__1::basic_string<wchar_t, std::__1::char_traits<wchar_t>, std::__1::allocator<wchar_t> >, diff_match_patch_traits<wchar_t> >::diff_linesToCharsMunge
Breakpoint 4 at 0x100021a1c: file diff_match_patch.h, line 658.
(gdb)
I am able to set breakpoints from the line number. This is likely what I will resort to most of the time, because:
To set a breakpoint with the name of a function, if it happens to be underneath a pile of templates (such as a class inheriting STL containers... grumble grumble), you have to go through this ordeal to get the ungodly signature of it in order to break on a function.
Here's how I accomplish it on OS X 10.8.4: type break at the gdb prompt, start typing the prefix (in this case diff_match_patch, the name of the class), then attempt to tab-complete. At this point I use tmux's search mode to isolate the signature of the function (diff_linesToCharsMunge). What I have pasted above is the beginning portion of the function's signature, which is enough to get gdb to happily set a breakpoint on it.
Now I'd like to make this somewhat better, but I'm not sure how. I'd like to keep things clean and robust by staying on the command line.
I reckon that if I wanted a nice (and friendly) GUI, a good way to go would be to set this code up in an Xcode project and go from there.
But what sort of options are there for me if I want to take more of a core UNIX style approach?
gdb is solid, for sure (and boy am I glad it seems to work seamlessly with my clang-compiled C++11 code), but it's lacking many niceties like what we get from e.g. ipython. ipython has full blown syntax highlighting! The clang compiler produces nicely colored and super friendly (cf. gcc at least) errors and warnings. With the sweet squigglies to show you the parts of offending expressions! gdb is just feeling long in the tooth by comparison.
So I'm just trying to make this work a little less ... painful. The areas to improve...
1. have tab completion search from the middle of the function signature, so I can just hit tab after typing the reasonable human-known name of the function
2. actually #1 about sums it up, but any viable workarounds that do other things (line number is one of them, I guess... sigh) are welcome. You know, I'm hoping maybe the clang project has some epic feature-rich gdb front-end.
Oh right, I knew about this, but had forgotten it. Just stumbled across another answer that mentioned it:
http://lldb.llvm.org/lldb-gdb.html
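On the wish to match from the middle of the signature: gdb already has two regex-based commands that do not need the full templated name. info functions REGEXP lists matching functions, and rbreak REGEXP sets a breakpoint on every function whose name matches. The sketch below wraps rbreak in a small Python command. The command name bmid and the helper rbreak_command are hypothetical, and passing the fragment through re.escape assumes it is a plain identifier, since gdb's regex dialect is not identical to Python's.

```python
import re


def rbreak_command(fragment):
    """Build the gdb command that breaks on every function containing `fragment`."""
    # Assumption: `fragment` is a plain identifier, so Python's escaping is
    # also valid in gdb's regex dialect.
    return "rbreak " + re.escape(fragment)


try:
    import gdb  # only available inside a gdb built with Python support
except ImportError:
    gdb = None

if gdb is not None:
    class BreakMid(gdb.Command):
        """bmid FRAGMENT: break on every function whose name contains FRAGMENT."""

        def __init__(self):
            # "bmid" is a made-up command name for this sketch.
            super().__init__("bmid", gdb.COMMAND_BREAKPOINTS)

        def invoke(self, arg, from_tty):
            fragment = arg.strip()
            if not fragment:
                raise gdb.GdbError("usage: bmid FRAGMENT")
            gdb.execute(rbreak_command(fragment))

    BreakMid()
```

After loading the file with source, typing bmid diff_linesToCharsMunge at the gdb prompt should set breakpoints on every instantiation that contains that name. Running info functions diff_linesToCharsMunge first shows what would be matched.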

Variable renaming for plagiarism detection for C/C++

I have a couple of simple C++ homeworks and I know the students shared code. These are smart students and they know how to cheat moss. I'm looking for a tool that can rename variables based on their types (first variable of type int will be int1, first int array will be intptr1...), or does something similar that I cannot think of now. Do you know a quick way to do this?
edit: I'm required to use moss and report 90% match
Thanks
Yep, the tool you're looking for is called a compiler. :)
Seriously, if the programs submitted are exactly the same except for the identifier names, compiling them (without debugging info) should result in exactly the same output.
If you do this with debugging turned on, the compiler may leave metadata in the executable that is different for each executable, hence the comment about ensuring it is off. This is also why this won't work for Java programs: that kind of info is present whether in debug mode or not (for the purposes of dynamic introspection).
EDIT: I see from the comments added to the question that you're observing some submissions that are different in more than just identifier names. If the programs are still structurally equivalent, this should still work.
EDIT: Given that the use of moss is a requirement, this probably isn't the way to go. It does seem, though, that moss has some support for comparing assembly. Perhaps compiling to assembler and submitting that to moss is an option (depending on what compiler you're using).
You can download and try our C CloneDR duplicate code detector. It finds duplicated code even when the variable names have been changed. Multiple changes in the same chunk are treated as just one; if they rename the variables consistently everywhere, you'll get back a report of "one clone" with the precise variable substitution.
You can try Copy Paste Detector with ignoreIdentifiers turned on. You can at least use it for a first pass before going to the effort of normalizing names for moss. Or, since the source is available, maybe you can get it to spit out its internal normalization of the code.
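If you do end up normalizing names yourself before feeding files to moss, a crude version is only a few lines. The sketch below is an assumption-laden illustration, not what moss, CloneDR or Copy Paste Detector do internally: it renames identifiers by order of first appearance (id1, id2, ...) rather than by type as the question asks, because knowing the types would need a real C++ parser. The KEEP set is a small, incomplete sample, and identifiers inside comments and string literals are renamed too.

```python
import re

# Words left untouched. Deliberately incomplete; extend it for real use.
KEEP = {
    "int", "char", "float", "double", "void", "bool", "long", "short",
    "unsigned", "signed", "const", "static", "struct", "class", "return",
    "if", "else", "for", "while", "do", "switch", "case", "break",
    "continue", "new", "delete", "using", "namespace", "include",
    "std", "cout", "cin", "endl", "main",
}

# \b keeps the pattern from matching inside numeric literals such as 0x1F.
IDENTIFIER = re.compile(r"\b[A-Za-z_]\w*")


def normalize_identifiers(source, keep=KEEP):
    """Rename every non-kept identifier to id1, id2, ... by first appearance."""
    mapping = {}

    def rename(match):
        word = match.group(0)
        if word in keep:
            return word
        if word not in mapping:
            mapping[word] = "id%d" % (len(mapping) + 1)
        return mapping[word]

    return IDENTIFIER.sub(rename, source)
```

Two submissions that differ only in consistently renamed variables normalize to the same text, for example "int total = 0; ... total += i;" and "int sum = 0; ... sum += k;". Reordered declarations will still defeat it, since numbering follows first appearance.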
Another way of doing this would be to compile the applications and compare their binaries, so your examination is not limited to variable/function name changing.
A hex editor can help you with that. I just tried ExamDiff (not free) and I was happy with the result.

print the code of a function in a DLL

I want to print the code of a function in a DLL.
I loaded the dll, I have the name of the desired function, what's next?
Thank you!
Realistically, next is getting the code. What you have in the DLL is object code -- binary code in the form ready for the processor to execute, not ready to be printed.
You can disassemble what's in the DLL. If you're comfortable working with assembly language, that may be useful, but it's definitely not the original source code (nor probably anything very close to it either). If you want to disassemble it, loading it in your program isn't (usually) a very good starting point. Try opening a VS command line and using dumpbin /disasm yourfile.dll. Be prepared for a lot of output unless the DLL in question is really tiny.
Your only option to retrieve hints about the actual implemented functionality of said function inside the DLL is to reverse engineer the machine code it contains. What this means is that you pretty much have to use a disassembler (e.g. IDA Pro) or a debugger (e.g. OllyDbg) to translate the opcodes to actual assembly mnemonics, and then just work your way through it and try to understand the details of how it functions.
Note that since it is compiled from C/C++, lots and lots of data is lost in the process due to optimization and the nature of compilation; the resulting assembly can (and probably will) seem cryptic and senseless, but it still does its job the exact same way as the programmer programmed it in the higher-level language. It won't be easy. It will take time. You will need luck and nerves. But it IS doable. :)
Nothing. A DLL is compiled binary code; you can't get the source just by downloading it and knowing the name of the function.
If this was a .NET assembly, you might be able to get the source using reflection. However, you mentioned C++, so this is doubtful.
Check out this http://www.cprogramming.com/challenges/solutions/self_print.html and this Program that prints its own code? and this http://en.wikipedia.org/wiki/Quine_%28computing%29
I am not sure if it will do what you want, but I guess it may help you.

Changing parts of compiled binaries

learned english as a second lang, sorry for the mistakes & awkwardness
I have been given a peculiar project to work on. The company has lost the source code for the app, and I have to make changes to it. Now, reverse engineering the whole thing is impossible for one man, it's just too huge; however, patching individual functions would be feasible, since the changes are not that monumental.
So, one possible solution would be compiling C code and somehow -after rewriting addresses- patching it into the actual binary, ideally, replacing the code the CALL instruction jumps to, or inserting a JMP to my code.
Is there any way to accomplish this using MingW32? If there is, can you provide a simple example? I'm also interested in books which could help me accomplish the task.
Thanks for your help
I use OllyDBG for this kind of thing. It allows you to see the disassembly and debug it, you can place breakpoints etc., and you can also edit the binary. So, you could edit the PE header of that program, adding a code section with your (compiled) code inside, then call it from the original program.
I can't give you any more detailed advice since I've never tried it, although I thought about it many times. You know, laziness.. :)
I would disassemble the program with a high-quality disassembler that produces something that can be assembled back into a runnable app, and then replace the parts you need to modify with C code.
Something like this will let you reverse the machine code into source. It won't be pretty but it does work.
http://www.hex-rays.com/idapro/
There are also tools for runtime patching http://www.dyninst.org/ for instance. They really aren't made for patching but they can do the trick.
And of course the last choice is to just use an assembler and write machine code :)
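For the "insert a JMP to my code" idea from the question, the machine code involved is small enough to compute by hand. On 32-bit x86, a near jump is the opcode byte E9 followed by a 32-bit little-endian displacement measured from the end of the 5-byte instruction. The sketch below assumes a 32-bit target (as with MingW32); the function names and the example addresses are made up. Note that it works with virtual addresses for the encoding but a file offset for the write, and mapping between the two requires reading the PE section headers, which is not shown.

```python
import struct


def jmp_rel32(patch_addr, target_addr):
    """Encode a 5-byte x86 near jump (E9 rel32) placed at patch_addr.

    Both arguments are virtual addresses. The displacement is relative to
    the address of the next instruction (patch_addr + 5) and wraps modulo
    2**32, as it does in 32-bit mode.
    """
    rel = (target_addr - (patch_addr + 5)) & 0xFFFFFFFF
    return b"\xE9" + struct.pack("<I", rel)


def apply_patch(image, file_offset, patch):
    """Return a copy of `image` (bytes) with `patch` written at `file_offset`."""
    if file_offset < 0 or file_offset + len(patch) > len(image):
        raise ValueError("patch does not fit inside the image")
    data = bytearray(image)
    data[file_offset:file_offset + len(patch)] = patch
    return bytes(data)
```

For example, jmp_rel32(0x401000, 0x405000) yields the bytes E9 FB 3F 00 00. The five overwritten bytes of the original function must not split an instruction that is still executed, so the replacement code usually has to account for them.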

Debugging Best Practices for C++ STL/Boost with gdb

Debugging with gdb, any c++ code that uses STL/boost is still a nightmare. Anyone who has used gdb with STL knows this. For example, see sample runs of some debugging sessions in code here.
I am trying to reduce the pain by collecting tips. Can you please comment on the tips I have collected below (particularly which ones you have been using and any changes you would recommend on them)? I have listed the tips in decreasing order of technicality.
Is anyone using "Stanford GDB STL utils" and "UCF GDB utils"? Are there similar utils for boost data structures? The utils above do not seem to be usable recursively, for example for printing a vector of boost::shared_ptr in a legible manner within one command.
Write your .gdbinit file. Include, for example, C++ related beautifiers, listed at the bottom of UCF GDB utils.
Use checked/debug STL/Boost library, such as STLport.
Use logging (for example as described here)
Update: GDB has a new C++ branch.
Maybe not the sort of "tip" you were looking for, but I have to say that my experience after a few years of moving from C++ & STL to C++ & boost & STL is that I now spend a lot less time in GDB than I used to. I put this down to a number of things:
boost smart pointers (particularly "shared pointer", and the pointer containers when performance is needed). I can't remember the last time I had to write an explicit delete (delete is the "goto" of C++ IMHO). There goes a lot of GDB time tracking down invalid and leaking pointers.
boost is full of proven code for things you'd probably hack together an inferior version of otherwise. e.g boost::bimap is great for the common pattern of LRU caching logic. There goes another heap of GDB time.
Adopting unittesting. boost::test's AUTO macros mean it's an absolute doddle to set up test cases (easier than CppUnit). This catches lots of stuff long before it gets built into anything you'd have to attach a debugger to.
Related to that, tools like boost::bind make it easier to design-for-test. e.g algorithms can be more generic and less tied up with the types they operate on; this makes plugging them into test shims/proxies/mock objects etc easier (that and the fact that exposure to boost's template-tasticness will encourage you to "dare to template" things you'd never have considered before, yielding similar testing benefits).
boost::array. "C array" performance, with range checking in debug builds.
boost is full of great code you can't help but learn from
You might look at:
Inspecting standard container (std::map) contents with gdb
I think the easiest option is to use logging (well, I actually use debug prints, but that's beside the point). The biggest advantage is that you can inspect any type of data, many times per program execution, and then search it with a text editor to look for interesting data. Note that this is very fast. The disadvantage is obvious: you must preselect the data which you want to log and the places where to log it. However, that is not such a serious issue, because you usually know where in the code bad things are happening (and if not, you just add sanity checks here and there, and then you will know).
Checked/debug libraries are good, but they are better as a testing tool (eg. run it and see if I'm doing anything wrong), and not as good at debugging a specific issue. They can't detect a flaw in user code.
Otherwise, I use plain GDB. It is not as bad as it sounds, although it might be if you are scared by "print x" printing a screenful of junk. But if you have debugging information, things like printing a member of a std::vector work, and if anything fails, you can still inspect the raw memory with the x command. But if I know what I'm looking for, I use option 1: logging.
Note that the "difficult to inspect" structures are not only STL/Boost, but also from other libraries, like Qt/KDE.
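For the "screenful of junk" problem, and for the question about printing a vector of boost::shared_ptr legibly, a gdb built with Python support lets you register pretty-printers, and gdb applies them recursively to children. Below is a minimal sketch for boost::shared_ptr. It assumes the raw pointer member is named px, which depends on the Boost version, so check your headers. The class and function names are made up; gdb.pretty_printers is the real registration list.

```python
class SharedPtrPrinter:
    """Show what a boost::shared_ptr points at instead of its raw members."""

    def __init__(self, val):
        self.val = val

    def to_string(self):
        # Assumption: the stored raw pointer is the member named px.
        if int(self.val["px"]) == 0:
            return "boost::shared_ptr (empty)"
        return "boost::shared_ptr"

    def children(self):
        px = self.val["px"]
        if int(px) != 0:
            # gdb prints each child with whatever printers apply to it.
            yield "pointee", px.dereference()


def lookup_shared_ptr(val):
    """Return a printer for boost::shared_ptr values, or None for other types."""
    tag = val.type.unqualified().strip_typedefs().tag
    if tag is not None and tag.startswith("boost::shared_ptr<"):
        return SharedPtrPrinter(val)
    return None


try:
    import gdb  # only available inside a gdb built with Python support
    gdb.pretty_printers.append(lookup_shared_ptr)
except ImportError:
    pass
```

Combined with the STL printers that ship with libstdc++, printing a std::vector of shared pointers should then show each pointee rather than the px and pn internals. Sourcing the file from .gdbinit loads it for every session.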