How do I force a process to core dump on RHEL 6?
I tried kill -3 , but the process is still running.
kill -SIGSEGV kills the process, but no core is generated :
terminate called after throwing an instance of 'omni_thread_fatal'
EVServices: ./../../../rw/db/dbref.h:251: T *RWDBCountedRef<T>::operator->() const [with T = RWDBHandleImp]: Assertion `(impl_) != 0' failed.
/evaluate/ev_dev87/shl/StartProcess.sh[69]: wait: 35225: Killed
Thu Dec 5 11:14:03 EST 2013 Exited EVServices, pid=35225, with ERROR returncode=265 signal=SIGKILL
Please tell me what else I can try to force a process to core.
Use SIGABRT to generate a core dump: kill -6 <pid>
This requires the running process to be allowed to write core dumps, issue ulimit -c unlimited in the same shell as the one used to run your program, before running that program.
Related
In macOS, I find that SIGABRT won't generate core dumps in some cases.
For example, I run a sleep in one terminal:
lianxin.wlx#mbp [01:08:21] [~/test]
-> % sleep 1000
And send a SIGABRT to it in another terminal:
lianxin.wlx#mbp [01:08:59] [~]
-> % ps -ef | grep sleep
502 47679 20388 0 1:08AM ttys001 0:00.01 sleep 1000
lianxin.wlx#mbp [01:09:03] [~]
-> % kill -6 47679
Then the sleep process is aborted, but no core dump is generated.
lianxin.wlx#mbp [01:08:21] [~/test]
-> % sleep 1000
[1] 47679 abort sleep 1000
lianxin.wlx#mbp [01:10:35] [~/test]
-> % ls /cores
lianxin.wlx#mbp [01:10:37] [~/test]
-> %
So why? I've tested the same operations in Linux, it did generate a core dump.
I'm sure I've opened the core dump right(ulimit -c unlimited, and /cores's privilege is 777). I wrote a program that will crash with SIGSEGV, and it did generate a core dump in /cores.
If you make a simple program,
main() {
abort();
}
It will generate a core dump if run with appropriate priv.
Also, if you make a:
main() {
sleep(100);
}
run it in the background and kill -ABRT , it will generate a core dump.
But /bin/sleep doesn't, which is a bit odd.
This is assuming you have followed the recipe in man core.
I am attaching a running process in linux platform using gdb.
gdb -p <pid>
Setting the breakpoint, say
b test.cpp : 23
Now continuing
c
After this when i perform some operation destined to attached process, getting the below error in gdb console:
(gdb) c
Continuing.
[New Thread 0x7409fb70 (LWP 8530)]
Cannot get thread event message: debugger service failed
Please help me how to proceed further.
Additional details:
Its icc compiler, compiling with -g flag enabled
I needed to debug a program asynchronously, because it stalled, and Ctrl+C killed gdb, rather than interrupting the program (this is on MinGW/MSYS).
Someone hinted that gdb wouldn't work on Windows in async mode, and indeed it didn't (with the Asynchronous execution not supported on this target. message), but that gdbserver would.
So I try:
$ gdbserver localhost:60000 ./a_.exe 0
Process ./a_.exe created; pid = 53644
Listening on port 60000
(Supplying the 0 as the argument, according to how the manpage says it's done.)
Then in another terminal:
$ gdb ./a_.exe
(gdb) target remote localhost:60000
Remote debugging using localhost:60000
0x76fa878f in ntdll!DbgBreakPoint () from C:\Windows\system32\ntdll.dll
(gdb) continue
Continuing.
[Inferior 1 (Remote target) exited with code 01]
While the original now looks like:
$ gdbserver localhost:60000 ./a_.exe 0
Process ./a_.exe created; pid = 53484
Listening on port 60000
Remote debugging from host 127.0.0.1
Expecting 1 argument: test case number to run.
Child exited with status 1
GDBserver exiting
That is, my program thought that it got no arguments.
Is the manpage wrong?
Yes! "Misleading" is a more fitting term. (Misleading, at least as it applies to this version of gdbserver on this platform.)
The first argument is literally the first argument (argv) given to the inferior. Normally this is the name of the executable. So, the following worked:
$ gdbserver localhost:60000 ./a_.exe whatever 0
That is, the manpage should have said, to be consistent:
target> gdbserver host:2345 emacs emacs foo.txt
I am not able to help you out with gdbserver,
but if you are just looking for a way to interrupt a program running in mingw/msys gdb(Similar to Ctrl+C on linux)
take a look at at Debugbreak
Is it possible to run a process with gdb, modify some memory and then detach from the process afterwards?
I can't start the process from outside of gdb as I need to modify the memory, before the first instruction is executed.
When you detach from a process started with gdb, gdb will hang, but killing gdb from another process makes the debugged process still running.
I currently use the following script to launch the process:
echo '# custom gdb function that finds the entry_point an assigns it to $entry_point_address
entry_point
b *$entry_point_address
run
set *((char *)0x100004147) = 0xEB
set *((char *)0x100004148) = 0xE2
detach # gdb hangs here
quit # quit never gets executed
' | gdb -quiet "$file"
This happens in both of my gdb versions:
GNU gdb 6.3.50-20050815 (Apple version gdb-1824)
GNU gdb 6.3.50-20050815 (Apple version gdb-1822 + reverse.put.as patches v0.4)
I'm pretty sure that you can't detach from an inferior processes that was started directly under gdb, however, something like the following might work for you, this is based on a recent gdb, I don't know how much of this will work on version 6.3.
Create a small shell script, like this:
#! /bin/sh
echo $$
sleep 10
exec /path/to/your/program arg1 arg2 arg3
Now start this up, spot the pid from echo $$, and attach to the shell script like this gdb -p PID. Once attached you can:
(gdb) set follow-fork-mode child
(gdb) catch exec
(gdb) continue
Continuing.
[New process NEW-PID]
process NEW-PID is executing new program: /path/to/your/program
[Switching to process NEW-PID]
Catchpoint 1 (exec'd /path/to/your/program), 0x00007f40d8e9fc80 in _start ()
(gdb)
You can now modify the child process as required. Once you're finished just do:
(gdb) detach
And /path/to/your/program should resume (or start in this case) running.
I have a process which suddenly hanged and is not giving any core dump and is also not killed.i can see it still running using the ps command.
how can i know which statement it is currently executing inside the code.
basically i want to know where exactly it got hanged.
language is c++ and platform is solaris unix.
demos.283> cat test3.cc
#include<stdio.h>
#include<unistd.h>
int main()
{
sleep(100);
return 0;
}
demos.284> CC test3.cc
demos.285> ./a.out &
[1] 2231
demos.286> ps -o "pid,wchan,comm"
PID WCHAN COMMAND
23420 fffffe86e9a5aff6 -tcsh
2345 - ps
2231 ffffffffb8ca3376 ./a.out
demos.290> ps
PID TTY TIME CMD
3823 pts/36 0:00 ps
23420 pts/36 0:00 tcsh
3822 pts/36 0:00 a.out
demos.291> pstack 3822
3822: ./a.out
fed1a215 nanosleep (80478c0, 80478c8)
080508ff main (1, 8047920, 8047928, fed93ec0) + f
0805085d _start (1, 8047a4c, 0, 8047a54, 8047a67, 8047c05) + 7d
demos.292>
You have several options: the easiest is to check the WCHAN wait channel that the process is sleeping on:
$ ps -o "pid,wchan,comm"
PID WCHAN COMMAND
2350 wait bash
20639 hrtime i3status
20640 poll_s dzen2
28821 - ps
This can give you a good indication of what the process is doing and is very easy to get.
You can use ktruss and ktrace or DTrace to trace your process. (Sorry, no Solaris here, so no examples.)
You can also attach gdb(1) to your process:
# gdb -p 20640
GNU gdb (Ubuntu/Linaro 7.2-1ubuntu11) 7.2
...
(gdb) bt
#0 0x00007fd1a99fd123 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:82
#1 0x0000000000405533 in ?? ()
#2 0x00007fd1a993deff in __libc_start_main (main=0x4043e3, argc=13, ubp_av=0x7fff25e7b478,
...
The backtrace is often the single most useful error report you can get from a process, so it is worth installing gdb(1) if it isn't already installed. gdb(1) can do a lot more than just show you backtraces, but a full tutorial is well outside the scope of Stack Overflow.
you can try with pstack passing pid as parameter. You can use ps to get the process id (pid)
For example: pstack 1267