I am running gdb on a very large C++ project, something like 500k SLOC and 300k symbols.
Whenever I mistype a TAB, gdb searches all of those symbols and then presents the helpful "Display all XX possibilities?" prompt.
The problem is that the search takes over a minute and pushes memory usage above 4 GB (and it does not go back down afterwards).
There must be a way to stop this behaviour in gdb. Can I disable TAB for symbol resolution/completion?
Or limit the number of loaded/searched symbols? Or at least kill the search once it starts? Ctrl+C does not work.
I'm using gdb 7.7.
Can I disable the TAB for symbol resolution/completion?
You can disable all tab-completion (for all programs using GNU readline, including GDB) by putting this:
set disable-completion on
into ~/.inputrc (see the readline init file documentation for details).
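If you would rather not lose completion in every readline program, readline's $if conditional should let you scope the setting to GDB alone; a minimal sketch, assuming GDB identifies itself to readline as "gdb":
$if gdb
set disable-completion on
$endif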
I don't believe there is a way to disable tab-completion for symbols only.
I have a gdb backtrace of a crashed process, but I can't see the specific line in which the crash occurred because the source code was not available at that moment. I don't understand some of the information given by the mentioned backtrace.
The backtrace is made of lines like the following one:
<path_to_binary_file>(_Z12someFunction+0x18)[0x804a378]
Notice that _Z12someFunction is the mangled name of int someFunction(double).
My questions are:
Does the +0x18 indicate the offset, starting at the address of _Z12someFunction, of the assembly instruction that produced the crash?
If the previous question is affirmative, and taking into account that I am working with a 32-bit architecture, does the +0x18 indicate 0x18 * 4 bytes?
If the above is affirmative, I assume that the address 0x804a378 is the address of _Z12someFunction plus 0x18, am I right?
EDIT:
The error occurred on a production machine (no cores enabled), and it seems to be a timing-dependent bug, so it is not easy to reproduce. That is why the information I am asking for is important to me on this occasion.
Most of your assumptions are correct. The +0x18 is indeed an offset, counted from the address of _Z12someFunction, and it is in bytes regardless of architecture.
0x804a378 is the actual address at which the error occurred.
With that said, it is important to understand what you can do about it.
First of all, compiling with -g will produce debug symbols. You rightfully strip those for your production build, but all is not lost. If you take your original executable (i.e., the one from before you stripped it), you can run:
addr2line -e executable
You can then feed the addresses gdb is giving you (0x804a378) into stdin, and addr2line will give you the precise file and line to which each address refers.
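For example, using the address from your backtrace (the executable path is a placeholder; -f also prints the enclosing function name and -C demangles it):
addr2line -e /path/to/unstripped/executable -f -C 0x804a378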
If you have a core file, you can also load it together with the unstripped executable and get full debug info. The result will still be somewhat mangled, as you are probably building with optimizations, but some variables should still be accessible.
Building with debug symbols and stripping before shipping is the best option. Even if you did not do that, however, if you build the same sources again with the same build tools, in the same environment, and with the same build options, you should get the same binary with the same symbol locations. If the bug is really difficult to reproduce, it might be worthwhile to try.
EDITED to add
Two more tools are important here. The first is c++filt: you feed it a mangled symbol and it produces the demangled C++ name of the actual source symbol. It also works as a filter, so you can just paste the backtrace into c++filt and it will give you the same backtrace, only more readable.
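For instance, if you saved the raw backtrace to a file (the file name is just an example), you can run the whole thing through the filter:
c++filt < backtrace.txt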
The second tool is gdb remote debugging. This allows you to run gdb on a machine that has the executable with debug symbols, while the actual code runs on the production machine. It allows live debugging in production (including attaching to already running processes).
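A rough sketch of that setup, assuming gdbserver is available on the production machine (host name, port, PID, and paths are placeholders):
gdbserver --attach :2345 <pid>              # run this on the production machine
gdb /path/to/unstripped/executable          # run this on the development machine
(gdb) target remote production-host:2345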
You are confused. What you are seeing is the output of glibc's backtrace/backtrace_symbols functions, not a backtrace produced by gdb.
but I can't see the specific line in which the crash occurred because the source code was not available at that moment
Now you can load the executable into gdb and examine the address 0x804a378 to get the line number. You can use list *0x804a378 or info symbol 0x804a378. See Convert a libc backtrace to a source line number and How to use addr2line command in linux.
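For example (the executable path is a placeholder; the address is the one from your backtrace):
gdb /path/to/unstripped/executable
(gdb) list *0x804a378
(gdb) info symbol 0x804a378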
Run man gcc and look at the -g option, which adds debug information to the binary object file. With it, when a crash happens and a core is dumped, gdb can show the exact line where the crash happened and why. Alternatively, you can run the process under gdb, or attach to it, and see the trace directly without hunting for the core file.
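A minimal illustration, with placeholder file names and PID:
g++ -g -o myprogram myprogram.cpp
gdb ./myprogram core        # inspect a core dump
gdb -p <pid>                # or attach to the running process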
I'm using gcc 4.9.2 & gdb 7.2 in Solaris 10 on sparc. The following was tested after compiling/linking with -g, -ggdb, and -ggdb3.
When I attach to a process:
~ gdb
/snip/
(gdb) attach pid_goes_here
... it does not load symbolic information. I started with NetBeans, which starts gdb without specifying the executable name until after the attach occurs, but I've eliminated NetBeans as the cause.
I can force it to load the symbol table under NetBeans if I attach to the process and then, in the debugger console, do one of the following:
(gdb) detach
(gdb) file /path/to/file
(gdb) attach the_pid_goes_here
or
(gdb) file /path/to/file
(gdb) sharedlibrary .
I want to know if there's a more automatic way I can force this behavior. So far googling has turned up zilch.
I want to know if there's a more automatic way I can force this behavior.
It looks like a bug.
Are you sure that the main executable symbols are loaded? This bug says that attach pid without giving the binary doesn't work on Solaris at all.
In any case, it's supposed to work automatically, so your best bet is probably to file a bug and wait for it to be fixed (or send a patch to fix it yourself :-)).
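Until then, you could at least automate the workaround from your question: pass the executable on the gdb command line, or (if your gdb supports -ex) script the file/attach/sharedlibrary sequence. Paths and PID below are placeholders:
gdb /path/to/file <pid>
gdb -ex 'file /path/to/file' -ex 'attach <pid>' -ex 'sharedlibrary .'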
I've configured all CONFIG_DEBUG_-related options to y, but when I try to debug the kernel, it says no debugging symbols found:
gdb /usr/src/linux-2.6.32.9/vmlinux /proc/kcore
Reading symbols from /usr/src/linux-2.6.32.9/vmlinux...(no debugging symbols found)...done.
Why?
Here is my best guess so far: I don't know, and it doesn't matter.
I don't know why GDB is printing the message "(no debugging symbols found)". I've actually seen this when building my own kernels. I configure a kernel to use debug symbols, but GDB still prints this message when it looks at the kernel image. I never bothered to look into it, because my image can still be debugged fine. Despite the message, GDB can still disassemble functions, add breakpoints, look up symbols, and single-step through functions. I never noticed a lack of debugging functionality. I'm guessing that the same thing is happening to you.
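One quick way to check whether the image really carries debug info is to look for DWARF sections (the path below mirrors the one in the question; adjust it to your tree):
readelf -S /usr/src/linux-2.6.32.9/vmlinux | grep debug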
Edit: Based on your comments to the question, it looks like you were searching for the wrong symbol with your debugger. System call handlers start with a prefix of sys_, but you can't tell that from looking at the code. The macro SYSCALL_DEFINE4(ptrace, ...) just ends up declaring the function as asmlinkage long sys_ptrace(...), although it does some other crazy stuff if you have ftrace enabled.
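If in doubt, you can confirm the real symbol name before setting a breakpoint; a small sketch, with the System.map path mirroring the question:
grep -w sys_ptrace /usr/src/linux-2.6.32.9/System.map
(gdb) info address sys_ptrace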
make menuconfig -> Kernel hacking -> [*] Kernel debugging -> [*] Compile the kernel with debug info (CONFIG_DEBUG_INFO)
It's also possible that the debug symbols were stripped when your vmlinuz image was packaged (for example, when using make-kpkg to build a .deb package for the Linux kernel). In that case, use the vmlinux file built in your kernel source tree, which still contains the debug symbols.
Add -g to the CFLAGS variable in the kernel Makefile
I might be wrong, but I thought you would have to install the debuginfo package for your kernel to get symbols.
I'm debugging a nasty problem where #includeing a file (not anything I wrote, for the record) causes a crash in my program. This means I have a working build and a broken build, with only one C(++) include statement changed. Some of the libraries I'm using don't have debugging information.
What I would like to do is get GDB to output every line of C++ executed during the program run (and the x86 instructions where source is not available) to a text file, in a format such that I can diff the two outputs and hopefully figure out what went wrong.
Is this easily possible in GDB?
You can check the difference between the pre-processed output in each version. For example:
gcc -dD -E a.cc -o a.pre
gcc -dD -E b.cc -o b.pre
diff -u a.pre b.pre
You can experiment with different "-d" settings to make that more verbose/concise. Maybe something in the difference of listings will be obvious. It's usually something like a struct which changes size depending on include files.
Failing that, if you really want to mess with per-instruction or line traces, you could probably use valgrind and see where the paths diverge, but I think you may be in for a world of pain. In fact you'll probably find valgrind finds your bug and then 100 more you didn't know about :) I expect the problem is just a struct or other data size difference, and you won't need to bother.
You could get gdb to automate line tracing. It would be quite painful. Basically you'd need to script it to run "n" (next line) repeatedly until a crash, then check the logs. If you can script "b main", then "run", then infinite "n" that would do it. There's probably a built-in command to do it but I'm not aware of it.
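A rough sketch of such a script, assuming line-by-line logging is acceptable (file names are placeholders; it will be extremely slow, and the loop simply aborts once the program exits or crashes):
# trace.gdb
set pagination off
set logging file trace.log
set logging on
break main
run
while 1
  next
end
Run it against each build, rename the log between runs, and diff the results:
gdb -batch -x trace.gdb ./prog_working
mv trace.log trace_working.log
gdb -batch -x trace.gdb ./prog_broken
mv trace.log trace_broken.log
diff trace_working.log trace_broken.log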
I don't think GDB can do this; maybe a profiler will help, though? Are you compiling with gcc? Look at the -p and -pg options; I think those might be useful.
The disassemble command at the gdb prompt will disassemble the current function you are stopped in, but I don't think outputting the entire execution path is feasible.
What library are you including? If it is open source, you can recompile it with debugging symbols enabled. Also, if you're using Linux, most distributions have -dbg versions of packages for common libraries.
I'm using Red Hat Linux 3. Can someone explain how it is possible that I am able to analyze, with gdb, a core dump generated on Red Hat Linux 5?
Not that I'm complaining :) but I need to be sure this will always work...
EDIT: the shared libraries are the same version, so no worries about that; they are placed on shared storage so they can be accessed from both Red Hat 5 and Red Hat 3.
Thanks.
You can try the following GDB commands to open a core file:
gdb
(gdb) exec-file <path to executable>
(gdb) set solib-absolute-prefix <path to shared library>
(gdb) core-file <path to core file>
The reason you can't rely on it is that every process uses libc or other system shared libraries, which will certainly have changed between Red Hat 3 and Red Hat 5. So the instruction addresses and the number of instructions in the native functions will differ, which is where the debugger gets confused and can show you wrong data to analyze. It is therefore always better to analyze the core on the same platform, or to copy all the required shared libraries to the other machine and set their path with set solib-absolute-prefix.
In my experience, analysing a core file generated on another system does not work, because the standard library (and the other libraries your program probably uses) will typically be different, so the addresses of the functions differ and you cannot even get a sensible backtrace.
Don't do it: even if it works sometimes, you cannot rely on it.
You can always run gdb -c /path/to/corefile /path/to/program_that_crashed. However, if program_that_crashed has no debug info (i.e. was not compiled and linked with the -g gcc/ld flag), the coredump is not that useful unless you're a hard-core debugging expert ;-)
Note that the generation of corefiles can be disabled (and it's very likely that it is disabled by default on most distros). See man ulimit. Run ulimit -c to see the size limit for core files; "0" means disabled. Try ulimit -c unlimited in this case. If a size limit is imposed, the coredump will not exceed that limit, possibly cutting off valuable information.
Also, the path where a coredump is generated depends on /proc/sys/kernel/core_pattern. Use cat /proc/sys/kernel/core_pattern to query the current pattern. It's actually a path, and if it doesn't start with / then the file will be generated in the current working directory of the process. And if cat /proc/sys/kernel/core_uses_pid returns "1", then the coredump will have the PID of the crashed process as its file extension. You can also set both values; e.g. echo -n /tmp/core > /proc/sys/kernel/core_pattern will force all coredumps to be generated in /tmp.
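Putting the commands above together, a typical sequence looks like this (the echo line needs root, and /tmp/core is just an example prefix):
ulimit -c unlimited
cat /proc/sys/kernel/core_pattern
cat /proc/sys/kernel/core_uses_pid
echo -n /tmp/core > /proc/sys/kernel/core_pattern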
I understand the question as: how is it possible that I am able to analyse a core that was produced under one version of an OS under another version of that OS?
Just because you are lucky (even that is questionable). There are a lot of things that can go wrong when trying to do so:
the tool chains (gcc, gdb, etc.) will be of different versions
the shared libraries will be of different versions
so no, you shouldn't rely on that.
You have asked a similar question and accepted an answer (of course, by yourself) here: Analyzing core file of shared object
Once you load the core file, you can get the stack trace, find the last function call, and check the code for the reason for the crash.
There is a small tutorial here to get started with.
EDIT:
This assumes you want to know how to analyse a core file using gdb on Linux, as your question is a little unclear.