Is it possible to regenerate symbols for an exe? - c++

One of my co-workers shipped a hot fix build to a customer, and subsequently deleted the pdb file. The build in question is crashing (intermittently) and we have a couple of crash dumps. We have all the source code in version control, and can compile it to an equivalent .exe and get symbols for that one. However, those symbols don't match the crash dump exactly. It seems like several of the functions are off by some constant offset, but we've only looked at a handful.
I'd love to be able to do the following (I can fake parts of this manually, but it's a huge amount of work): get a stack trace for each thread in the dump and cast pointers in the dump to the appropriate type and have them show up in the Visual Studio debugger. I'm using 2005, if that matters.
Is there a tool to let us recreate a pdb given the source code, all the .obj files, and the original .exe? Or is there a setting when we compile/link to say "make it exactly like this other exe you just did" or something like that?
Quick update, based on answers so far: I have the exe file that we sent to the customer, just not the pdb that corresponds to it, if that helps. I'd just as soon not send them a new build (if possible), because it takes about a week of running to get the crash dumps, and the customer is already at the "why isn't this already fixed?" stage. (If we do send another build, I'd prefer it to be one that either fixes the problem or has additional debugging in the area of interest, not just the same code.) I know it's possible to do some of this manually with a lot of guesswork; that's what we're currently doing. But it's a pain, so I'm hoping there's a way to automate it.

You cannot recreate a PDB to match a pre-existing executable. The PDB contains a "fingerprint" that is unique for each compilation. Unless you can make the old PDB magically reappear, you should whack your cow-orker in the back of the head (Gibbs-style, if you watch NCIS), recompile the whole thing, store the PDB somewhere safe, ship a new executable to your customer, and let the crashes come.
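To make the "fingerprint" concrete: for PDB 7.0 files, the executable's debug directory holds a CodeView record that starts with the signature "RSDS", followed by a 16-byte GUID, a 4-byte age, and the PDB path, and the debugger only loads a PDB whose GUID and age agree with it. The following is a minimal sketch of that comparison, assuming the record bytes have already been extracted from the executable; the names PdbIdentity, parse_rsds and same_build are made up for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Identity stamped into both the executable's debug directory and the PDB.
struct PdbIdentity {
    std::array<std::uint8_t, 16> guid{};  // created together with the PDB
    std::uint32_t age = 0;                // revision counter of that PDB
};

// Parses a CodeView "RSDS" record: 4-byte signature, 16-byte GUID,
// 4-byte little-endian age, then the NUL-terminated PDB path (ignored here).
// Returns false if the buffer is too short or the signature is different.
bool parse_rsds(const std::vector<std::uint8_t>& rec, PdbIdentity& out) {
    const std::size_t kHeader = 4 + 16 + 4;
    if (rec.size() < kHeader || std::memcmp(rec.data(), "RSDS", 4) != 0)
        return false;
    std::memcpy(out.guid.data(), rec.data() + 4, 16);
    out.age = std::uint32_t(rec[20]) | (std::uint32_t(rec[21]) << 8) |
              (std::uint32_t(rec[22]) << 16) | (std::uint32_t(rec[23]) << 24);
    return true;
}

// A rebuilt PDB is rejected when this is false, even if the code is identical.
bool same_build(const PdbIdentity& a, const PdbIdentity& b) {
    return a.guid == b.guid && a.age == b.age;
}
```

A rebuild from the same sources produces a new GUID, which is why the recompiled symbols are refused for the shipped .exe even when most addresses line up.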

If your build system enables you to recreate any binary from any revision you have in your history, then you should be able to get the build ID from the customer, and regenerate that same exact build ID, along with all the binaries and so forth. That will take a while if you have a large project, of course, but it will also yield the debugging file that you need.
If you have no way to perform an exact reproduction of a build, then look at this situation, think hard about some others that might crop up, and start moving to make it possible to regenerate all successful builds and associated files in the project's history. This will make it much easier to be able to work problems like this in the future.

When you have the sources, it's quite easy to find the correspondence between them and the exe file. Just ask them to send you the exe file along with the crash log and use IDA.
What you are asking is much more difficult than that, considering also that you need it for "one use only".

Related

DWARF diff tool for debug info file

I have a binary that is 10s of MB large without debugging symbols, but with debugging symbols it's 100s of MB large. In the normal development cycle I copy the several 100 MB binary (with debug symbols) over a very slow link repeatedly. I am trying to minimize the amount of information I need to send to speed up the transfer.
I have researched binary diff tools like bsdiff and courgette, but the time and memory they take is prohibitive for me given the size of the binary and the frequency I'd like to be able to transfer it. As pointed out in responses, there are ways to mitigate the problem of needing to send the debug info to the remote host. gdbserver won't work in my use case because we'd also like the application to be able to log backtrace information with symbols. Using PC values and addr2line was considered, but keeping the source binary around can be confusing if trying to make forward progress while also testing on a remote machine. Additionally, with multiple developers, having access to debug info on some other developer's machine isn't always easy.
strip can separate out the binary from the debug info, so I was wondering if there were tools to compare and "diff" two debug info files, since that's where 95% of my space is anyway? And between iterations, a lot of the raw content in the debug info file is the same (i.e. the names and relationships, which is painfully verbose in C++).
Using the suggestion from user657267 I have also investigated using -gsplit-dwarf to separate out the .dwo files. This may work, but my concern is that over time core headers will change and cause minor changes to every .dwo file, so I'll end up transferring everything anyway assuming my "base" stays the same, even though most of the content of the .dwo file is unchanged. This could possibly be worked around in interesting ways (e.g. repository of .dwo files), but I'd like to avoid it if possible.
So, assuming I have a DWARF debug info file from a previous compilation, is there a way to compare it to the DWARF debug info file from the current compilation and get something smaller to transfer?
As an absolute last resort, I can write some type of lookup and translation code. But are there convenient tools for viewing, modifying, and then "unmodifying" a DWARF debug info file? I have found tools like pyelftools and DWARF utilities. The former only reads the DIEs, and too slowly in my case, while the latter doesn't work well with C++ and I'm still investigating building it from the latest source.
Along these lines, I have investigated what the dwz tool announced here is doing to see if it can be tweaked to borrow DIEs from an already existing (but stale) debug info file. Any tips, documents, or pseudo-code in this direction would also be helpful.
In the normal development cycle I have to copy my several 100 MB binary (with debug symbols) over a very slow link over and over again.
Have to?
Your use case screams for using remote debugging, where all the debug info stays on the development system, and you only have to transfer the stripped binary to the target.
Info about using gdbserver is here.
If, for some reason you can't use gdbserver ...
From this link gcc.gnu.org/wiki/DebugFission, I still don't understand how having a separate dwarf file is going to help me diff easier?
With separate debug info, in a usual compile/debug cycle, you'll be recompiling only a few files, and relinking the final binary.
That means that most of the .o and .dwo files will not be rebuilt, which means that you wouldn't have to re-send the unchanged .dwo files to the target, i.e. you get incremental debug info updates "for free".
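As a rough sketch of that incremental update, assuming you keep a manifest that maps each .dwo file name to some content fingerprint (a checksum, say) for both the local build and the copy on the target: only the entries that are new or different need to be transferred. Manifest and files_to_send are hypothetical names, and how the fingerprints are computed is left out.

```cpp
#include <map>
#include <string>
#include <vector>

// Maps a .dwo file name to a content fingerprint, for example a checksum
// produced by some other step (not shown in this sketch).
using Manifest = std::map<std::string, std::string>;

// Returns the names that are new, or whose fingerprint differs from what the
// target already has. Only these need to cross the slow link.
std::vector<std::string> files_to_send(const Manifest& on_target,
                                       const Manifest& local) {
    std::vector<std::string> out;
    for (const auto& entry : local) {
        auto it = on_target.find(entry.first);
        if (it == on_target.end() || it->second != entry.second)
            out.push_back(entry.first);
    }
    return out;
}
```

Tools such as rsync do this kind of comparison already; the sketch only shows why unchanged .dwo files cost nothing.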
Update:
We also use the debug symbols to generate backtraces for exceptions in the running application. I think having the symbols locally is the only option for that use case.
Only if you insist on the backtrace being fully-symbolized with file/line info.
The usual way to deal with this is to have the backtrace contain only PC values (and perhaps function names), and use addr2line on the development system to recover file/line info when necessary.
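A minimal sketch of the PC-only approach, assuming a glibc target where backtrace() from <execinfo.h> is available; capture_pcs and format_pcs are made-up helper names. The application logs raw addresses, and the symbols never have to leave the development machine.

```cpp
#include <execinfo.h>  // backtrace(); provided by glibc, not by the C++ standard
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Collects the return addresses of the current call stack (no symbols needed).
std::vector<void*> capture_pcs(int max_frames = 64) {
    std::vector<void*> frames(max_frames);
    int n = backtrace(frames.data(), max_frames);
    frames.resize(n > 0 ? n : 0);
    return frames;
}

// Formats one hexadecimal address per line, ready to be written to a log.
std::string format_pcs(const std::vector<void*>& frames) {
    std::string out;
    char buf[32];
    for (void* pc : frames) {
        std::snprintf(buf, sizeof buf, "0x%llx\n",
                      (unsigned long long)reinterpret_cast<std::uintptr_t>(pc));
        out += buf;
    }
    return out;
}
```

On the development system, each logged address can then be passed to addr2line -e with the unstripped binary. For position-independent executables the load address has to be subtracted first, which this sketch does not do.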

Restore changes in cpp file after close in Visual studio 2012

I wrote a lot of C++ code and saved it. After that, I wanted to try some example code I had found, so I pasted it into my project's main.cpp file (where my own code was). I tried the example code and then closed the file by mistake. When I reopened main.cpp, I could not undo the changes with Ctrl-Z. I had only wanted to try the example code and then undo the paste with Ctrl-Z, but I closed the file first. Is it possible to undo changes after closing a file, or to restore the previous contents?
Your original code is probably gone for good. However, perhaps this is a good time for you to consider adding a version control system to your tool set, which will help you avoid this kind of mistake in future, as well as give you a lot of other benefits.
Also, it is not a wise idea to paste example code over your own work in the way that you've done, for exactly the reason you've discovered. Insert a new file into your project, or create a separate project for testing example code. I have a separate Visual Studio solution specifically for this purpose.
EDIT: I say "probably" because I can't rule out all possibility of recovery based on the information you've supplied (e.g. you might have some kind of scheduled backup which caught your previous version). Also, if the code you pasted over it was shorter than your original code, it's possible that some of it still exists as unused data blocks on your hard drive, and might be recoverable, assuming something else hasn't already overwritten them.

Creating a dll with very large files

This is the first time I am writing in this forum, I hope someone could help me. I have been searching on the Web but have not found any answer related to my question.
I have a very large file (about 25000 lines) with thousands of definitions that must be used by another file.
All these files (and about 600 more of them) are converted to .c files using a special tool. I am almost sure this conversion is done properly.
If I create an .exe with all these files, there is no problem and everything works all right. Unfortunately, I need a .dll, and it crashes when I try to access the very large file.
I have checked that its .obj file is larger than 65 MB, so I added the /bigobj compiler option, as suggested on the Internet, but it didn't solve the problem.
I have also checked that the problem happens when accessing the large file, because everything works fine when I merge both files (which is not possible in my development setup).
I am using Visual Studio 2008.
Could it be related to compiling as C (/TC) or C++ (/TP) code? What difference between an .exe and a .dll might make my program crash?
Any ideas please?
Thanks in advance
Indeed, without the code not much can be said... (though I'm not sure anyone would have the patience to read 600 files, each with 25k lines of code :) )
As advice, rebuild the exe and dll in debug mode, run the exe from MSVC, then put a breakpoint where you know it crashes. Next set a data breakpoint on the variable after you get its address from the watch window. ASSUMING the app does what it should correctly, then the pointer is set, but lost along the way; that means it should be triggered twice.
Alternatively, try an assertion check.
Another possibility is that the variable is volatile.
Another possibility is that the value is returned from a temporary and gets lost...
And last but not least, the value may never be set because of wrong/bad conditions...
If your problem is the crash and not the missing value, just do a null check and return from the call if you really want to avoid the complication. However, I would recommend that you find out why the value isn't set. Your choice.

Using MSBuild in VC++ 2010 to do custom preprocessing on files

I am trying to insert a custom preprocessor into the VC++ 2010 build pipeline after the regular preprocessor has finished. So far I have figured out that the way to do this is via MSBuild.
To this point I wasn't able to find out much more, so my questions are:
Is this possible at all?
If so, what do I need to look at to get going?
If you are talking about the C/C++ preprocessor, then you are probably out of luck. AFAIK, the preprocessor is built into the compiler itself. You can get the compiler to output a preprocessed file, and then you MIGHT be able to send that through the compiler a second time to get the final output.
This may not work anyway, because the preprocessed code, at least in previous versions of cl.exe, doesn't seem to be 100% correct (white space gets mangled slightly, which can cause errors).
If you want to take this path, what you'd need to do is have an MSBuild 'Target' that runs before the 'ClCompile' target.
This new target would have to run the 'cl.exe' program with all the settings that 'ClCompile' usually sends it with as well as the '/P' option which will "preprocess to a file".
Then you need to run your tool over the processed file and then finally feed these new files into 'ClCompile'.
If you need any more info, just reply in the comments and I'll try to add some when I get the time (this question is rather old, so I'm not sure if it's worthwhile investing more time in this answer).

Include pdbs in installer?

Is there any reason to not include pdb files in an installer? I have C++ logging functionality that walks the stack, and reports line numbers and file names. It would be great if my customers could send me logs with this information. However, they would need the pdb files. Is there any downside (other than installer package size) to deploying them?
Two possible downsides:
The PDB file might make it easier for someone to reverse-engineer your application.
As a result of the previous, someone might come to expect to be able to call undocumented functions in your DLLs.
If those don't bother you, I can't see any downside. Note though that you don't really need this. As John Seigel says, you should be able to reconstruct the stack trace from a crash dump.
You should be able to achieve "line numbers and file names" without PDB files. Try using __FUNCTION__, __LINE__, and __FILE__. Read more here:
http://msdn.microsoft.com/en-us/library/b0084kay.aspx
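A minimal sketch of how those macros might be used for logging; format_location and LOG_LOCATION are hypothetical names, not part of any library. Note that this records only the location of the logging call itself, not the rest of the call stack, so it does not replace a real stack walk.

```cpp
#include <sstream>
#include <string>

// Builds "file(line): function: message" from values captured at the call site.
inline std::string format_location(const char* file, int line,
                                   const char* func, const std::string& msg) {
    std::ostringstream os;
    os << file << '(' << line << "): " << func << ": " << msg;
    return os.str();
}

// The macro is what makes __FILE__ and __LINE__ refer to the caller rather
// than to the helper above. __FUNCTION__ is a compiler extension (MSVC, GCC,
// Clang); __func__ is the standard spelling.
#define LOG_LOCATION(msg) format_location(__FILE__, __LINE__, __FUNCTION__, (msg))
```

A call such as LOG_LOCATION("allocation failed") then yields a string that can be written to whatever log the application already has.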
Instead of shipping the PDB files, your error handling code can create minidumps. See the function MiniDumpWriteDump. Minidumps are very small and can easily be sent via e-mail.
If you get the dump file from the customer, only you need the PDB files.
IMHO, it is a very good idea to catch asserts or unexpected errors in your application, create a minidump automatically and let your application send this dump to you. If you get really fancy, you build yourself an automated bug tracking database in which these minidumps are stored. Then, you can find out which bugs are most common and need to be fixed first. Incidentally, you will find out a lot about the environment your application runs in: which operating system versions are most common, which virus scanners hook into your application, etc.
Obviously, this requires the consent of your users since the minidump may contain private information (however little information there is on the stack). It is not trivial to implement a working error handler that can catch, e.g., stack overflow exceptions.