Per-process configurable core dump directory - C++

Is there a way to configure the directory where core dump files are placed for a specific process?
I have a daemon process written in C++ for which I would like to configure the core dump directory. Optionally the filename pattern should be configurable, too.
I know about /proc/sys/kernel/core_pattern; however, this would change the pattern and directory structure globally.
Apache has the directive CoreDumpDirectory - so it seems to be possible.

No, you cannot set it per process. The core file is dumped either to the current working directory of the process, or to the directory set in /proc/sys/kernel/core_pattern if the pattern includes a directory.
CoreDumpDirectory in Apache is a hack: Apache registers signal handlers for all signals that cause a core dump, and changes the current directory inside its signal handler.
/* handle all varieties of core dumping signals */
static void sig_coredump(int sig)
{
    apr_filepath_set(ap_coredump_dir, pconf);
    apr_signal(sig, SIG_DFL);
#if AP_ENABLE_EXCEPTION_HOOK
    run_fatal_exception_hook(sig);
#endif
    /* linuxthreads issue calling getpid() here:
     *   This comparison won't match if the crashing thread is
     *   some module's thread that runs in the parent process.
     *   The fallout, which is limited to linuxthreads:
     *   The special log message won't be written when such a
     *   thread in the parent causes the parent to crash.
     */
    if (getpid() == parent_pid) {
        ap_log_error(APLOG_MARK, APLOG_NOTICE,
                     0, ap_server_conf,
                     "seg fault or similar nasty error detected "
                     "in the parent process");
        /* XXX we can probably add some rudimentary cleanup code here,
         * like getting rid of the pid file. If any additional bad stuff
         * happens, we are protected from recursive errors taking down the
         * system since this function is no longer the signal handler GLA
         */
    }
    kill(getpid(), sig);
    /* At this point we've got sig blocked, because we're still inside
     * the signal handler. When we leave the signal handler it will
     * be unblocked, and we'll take the signal... and coredump or whatever
     * is appropriate for this particular Unix. In addition the parent
     * will see the real signal we received -- whereas if we called
     * abort() here, the parent would only see SIGABRT.
     */
}

It is possible to achieve this using the "|command" (pipe) mechanism of the core_pattern file. The executed command can create the directories and files as needed. The command can be passed the following specifiers in the parameters (cf. man 5 core):
%% a single % character
%c core file size soft resource limit of crashing process
%d dump mode—same as value returned by prctl(2) PR_GET_DUMPABLE
%e executable filename (without path prefix)
%E pathname of executable, with slashes ('/') replaced by exclamation marks ('!')
%g (numeric) real GID of dumped process
%h hostname (same as nodename returned by uname(2))
%i TID of thread that triggered core dump, as seen in the PID namespace in which the thread resides
%I TID of thread that triggered core dump, as seen in the initial PID namespace
%p PID of dumped process, as seen in the PID namespace in which the process resides
%P PID of dumped process, as seen in the initial PID namespace
%s number of signal causing dump
%t time of dump, expressed as seconds since the Epoch, 1970-01-01 00:00:00 +0000 (UTC)
%u (numeric) real UID of dumped process
For example, it is possible to create a script (e.g. named crash.sh) as follows:
#!/bin/bash
# $1: process number on host side (%P)
# $2: program's name (%e)
OUTDIR=/tmp/core/$2
OUTFILE="core_$1"
# Create a sub-directory in /tmp
mkdir -p "$OUTDIR"
# Redirect stdin into a per-process file:
cat > "$OUTDIR"/"$OUTFILE"
exit 0
In the shell:
$ chmod +x crash.sh
$ mv crash.sh /tmp # Put the script in some place
$ sudo su
# echo '|/tmp/crash.sh %P %e' > /proc/sys/kernel/core_pattern
# cat /proc/sys/kernel/core_pattern
|/tmp/crash.sh %P %e
# exit
$
Create an example program which crashes (e.g. fail.c):
int main(void)
{
    char *ptr = (char *)0;
    *ptr = 'q';
    return 0;
}
Compile the program (make several executables) and adjust the core file size in the current shell:
$ gcc fail.c -o fail1
$ gcc fail.c -o fail2
$ ulimit -c
0
$ ulimit -c unlimited
$ ulimit -c
unlimited
Run the failing programs several times to get multiple process IDs:
$ ./fail1
Segmentation fault (core dumped)
$ ./fail2
Segmentation fault (core dumped)
$ ./fail1
Segmentation fault (core dumped)
$ ./fail2
Segmentation fault (core dumped)
Look at /tmp, where core_pattern redirects the core dumps:
$ ls -l /tmp/core
total 8
drwxrwxrwx 2 root root 4096 nov. 3 15:57 fail1
drwxrwxrwx 2 root root 4096 nov. 3 15:57 fail2
$ ls -l /tmp/core/fail1/
total 480
-rw-rw-rw- 1 root root 245760 nov. 3 15:57 core_10606
-rw-rw-rw- 1 root root 245760 nov. 3 15:57 core_10614
$ ls -l /tmp/core/fail2
total 480
-rw-rw-rw- 1 root root 245760 nov. 3 15:57 core_10610
-rw-rw-rw- 1 root root 245760 nov. 3 15:57 core_10618

Related

Calling a shell script from the C++ system() function makes the script run as a different user

I am using the system() C++ call to execute a shell script. The caller program runs as root, but the shell script called from the C++ code runs as a different user.
How can I make sure the shell script also runs as root, like the C++ binary?
I don't want to rely on the sudo command, as it can ask for a password.
[user@user ~]$ ll a.out temp.sh
-rwsrwsr-x 1 root root 8952 Jun 14 13:16 a.out
-rwxrwxr-x 1 user user 34 Jun 14 15:43 temp.sh
[user@user ~]$ cat temp.sh
#!/bin/bash
read -n 1 -p "Hello"
[user@user ~]$ ps aux | grep temp
root 13247 0.0 0.0 13252 1540 pts/0 S+ 15:44 0:00 ./a.out ./temp.sh
user 13248 0.0 0.0 113152 2544 pts/0 S+ 15:44 0:00 /bin/bash ./temp.sh
C++ code:
#include <bits/stdc++.h>
using namespace std;

int main(int argc, char *argv[])
{
    system(argv[1]);
    return 0;
}
A few bits of documentation to start:
From man 3 system's caveats section:
Do not use system() from a privileged program (a set-user-ID or set-group-ID program, or a program with capabilities) because strange values for some environment variables might be used to subvert system integrity. For example, PATH could be manipulated so that an arbitrary program is executed with privilege. Use the exec(3) family of functions instead, but not execlp(3) or execvp(3) (which also use the PATH environment variable to search for an executable).
system() will not, in fact, work properly from programs with set-user-ID or set-group-ID privileges on systems on which /bin/sh is bash version 2: as a security measure, bash 2 drops privileges on startup. (Debian uses a different shell, dash(1), which does not do this when invoked as sh.)
And from the bash manual's description of the -p command line argument (Emphasis added):
Turn on privileged mode. In this mode, the $BASH_ENV and $ENV files are not processed, shell functions are not inherited from the environment, and the SHELLOPTS, BASHOPTS, CDPATH and GLOBIGNORE variables, if they appear in the environment, are ignored. If the shell is started with the effective user (group) id not equal to the real user (group) id, and the -p option is not supplied, these actions are taken and the effective user id is set to the real user id. If the -p option is supplied at startup, the effective user id is not reset. Turning this option off causes the effective user and group ids to be set to the real user and group ids.
So even if your /bin/sh doesn't drop privileges when run, bash will when it's run in turn without explicitly telling it not to.
So one option is to scrap using system(), and do a lower-level fork()/exec() of bash -p your-script-name.
Some other approaches to allowing scripts to run at elevated privileges are mentioned in Allow suid on shell scripts. In particular the answer using setuid() to change the real UID looks like it's worth investigating.
Or configure sudo to not require a password for a particular script for a given user.
Also see Why should I not #include <bits/stdc++.h>?

abrt - use event to copy/move coredump to custom location

I cannot seem to find a way to configure my abrt event to copy the coredump to a custom location. The reason I want to do this is to prevent abrt from pruning my coredumps if the crash directory exceeds MaxCrashReportsSize. With the prerequisite that I have no control over how abrt is configured I would like to export the coredump to a support directory as soon as it is created.
EVENT=post-create pkg_name=raptorio analyzer=CCpp
test -f coredump && { mkdir -p /opt/raptorio/cores; cp -f coredump /opt/raptorio/cores/$(basename `cat executable`).core; }
This event will save one coredump for each C/C++ binary from my raptorio RPM package. When my program crashes abrt prints the following errors in the syslog:
Aug 30 08:28:41 abrtd: mkdir: cannot create directory `/opt/raptorio/cores': Permission denied
Aug 30 08:28:41 abrtd: cp: cannot create regular file `/opt/raptorio/cores/raptord.core': No such file or directory
Aug 30 08:28:41 abrtd: 'post-create' on '/var/spool/abrt/ccpp-2016-08-30-08:28:10-31213' exited with 1
I see that the abrt event runs as root:root but it is jailed somehow, possibly due to SELinux? I am using abrt 2.0.8 on centos 6.
/opt is not the right place to keep transient files; cores should go in /var/raptorio/cores, perhaps. See the Filesystem Hierarchy Standard.
Assuming your program runs as user 'nobody', make sure 'nobody' has write permissions on that directory, and you should be all set.

Forked and executed program does not return to console

I took example program from Advanced Linux Programming site:
/***********************************************************************
* Code listing from "Advanced Linux Programming," by CodeSourcery LLC *
* Copyright (C) 2001 by New Riders Publishing *
* See COPYRIGHT for license information. *
***********************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
/* Spawn a child process running a new program. PROGRAM is the name
of the program to run; the path will be searched for this program.
ARG_LIST is a NULL-terminated list of character strings to be
passed as the program's argument list. Returns the process id of
the spawned process. */
int spawn (char* program, char** arg_list)
{
    pid_t child_pid;

    /* Duplicate this process. */
    child_pid = fork ();
    if (child_pid != 0)
        /* This is the parent process. */
        return child_pid;
    else {
        /* Now execute PROGRAM, searching for it in the path. */
        execvp (program, arg_list);
        /* The execvp function returns only if an error occurs. */
        fprintf (stderr, "an error occurred in execvp\n");
        abort ();
    }
}

int main ()
{
    /* The argument list to pass to the "ls" command. */
    char* arg_list[] = {
        "ls",   /* argv[0], the name of the program. */
        "-l",
        "/",
        NULL    /* The argument list must end with a NULL. */
    };

    /* Spawn a child process running the "ls" command. Ignore the
       returned child process id. */
    spawn ("ls", arg_list);
    printf ("done with main program\n");
    return 0;
}
After compiling and running it from the console, the child process does not exit, thus it doesn't release the console.
Only Ctrl+C helps to return to console.
vladon#vladon-dev-mint64 ~/Projects/test $ gcc -o test test.c
vladon#vladon-dev-mint64 ~/Projects/test $ ./test
done with main program
vladon#vladon-dev-mint64 ~/Projects/test $ total 104
drwxr-xr-x 2 root root 4096 Mar 11 11:57 bin
drwxr-xr-x 3 root root 4096 Mar 11 11:57 boot
[ ... too many lines of my filesystem skipped ... ]
drwxr-xr-x 10 root root 4096 Nov 27 01:12 usr
drwxr-xr-x 11 root root 4096 Nov 27 01:48 var
^C
vladon#vladon-dev-mint64 ~/Projects/test $
How can I run another program and exit to console back?
The first program completed, without waiting for the child process to complete. The shell gave you a prompt, but then the output of the ls -l command started.
The shell was still waiting for you when you hit the interrupt; if you'd typed echo Hi, it would have done your bidding.
Here's your sample output, annotated:
vladon#vladon-dev-mint64 ~/Projects/test $ gcc -o test test.c
vladon#vladon-dev-mint64 ~/Projects/test $ ./test
done with main program
vladon#vladon-dev-mint64 ~/Projects/test $ total 104
The previous line has your prompt, and also the first line of output from ls -l.
drwxr-xr-x 2 root root 4096 Mar 11 11:57 bin
drwxr-xr-x 3 root root 4096 Mar 11 11:57 boot
[ ... too many lines of my filesystem skipped ... ]
drwxr-xr-x 10 root root 4096 Nov 27 01:12 usr
drwxr-xr-x 11 root root 4096 Nov 27 01:48 var
^C
If you'd typed echo Hi instead of Control-C, you'd have seen Hi and the next prompt. Just like you got the next prompt after interrupting the shell…
vladon#vladon-dev-mint64 ~/Projects/test $

Force core dump on RHEL 6

How do I force a process to core dump on RHEL 6?
I tried kill -3, but the process is still running.
kill -SIGSEGV kills the process, but no core is generated:
terminate called after throwing an instance of 'omni_thread_fatal'
EVServices: ./../../../rw/db/dbref.h:251: T *RWDBCountedRef<T>::operator->() const [with T = RWDBHandleImp]: Assertion `(impl_) != 0' failed.
/evaluate/ev_dev87/shl/StartProcess.sh[69]: wait: 35225: Killed
Thu Dec 5 11:14:03 EST 2013 Exited EVServices, pid=35225, with ERROR returncode=265 signal=SIGKILL
Please tell me what else I can try to force a process to core.
Use SIGABRT to generate a core dump: kill -6 <pid>
This requires the running process to be allowed to write core dumps; issue ulimit -c unlimited in the same shell as the one used to run your program, before running that program.

How to know which statement the running process is executing

I have a process which suddenly hung; it is not producing a core dump and has not been killed. I can see it still running using the ps command.
How can I know which statement it is currently executing inside the code?
Basically I want to know where exactly it hung.
The language is C++ and the platform is Solaris Unix.
demos.283> cat test3.cc
#include <stdio.h>
#include <unistd.h>

int main()
{
    sleep(100);
    return 0;
}
demos.284> CC test3.cc
demos.285> ./a.out &
[1] 2231
demos.286> ps -o "pid,wchan,comm"
PID WCHAN COMMAND
23420 fffffe86e9a5aff6 -tcsh
2345 - ps
2231 ffffffffb8ca3376 ./a.out
demos.290> ps
PID TTY TIME CMD
3823 pts/36 0:00 ps
23420 pts/36 0:00 tcsh
3822 pts/36 0:00 a.out
demos.291> pstack 3822
3822: ./a.out
fed1a215 nanosleep (80478c0, 80478c8)
080508ff main (1, 8047920, 8047928, fed93ec0) + f
0805085d _start (1, 8047a4c, 0, 8047a54, 8047a67, 8047c05) + 7d
demos.292>
You have several options: the easiest is to check the WCHAN wait channel that the process is sleeping on:
$ ps -o "pid,wchan,comm"
PID WCHAN COMMAND
2350 wait bash
20639 hrtime i3status
20640 poll_s dzen2
28821 - ps
This can give you a good indication of what the process is doing and is very easy to get.
You can use ktruss and ktrace or DTrace to trace your process. (Sorry, no Solaris here, so no examples.)
You can also attach gdb(1) to your process:
# gdb -p 20640
GNU gdb (Ubuntu/Linaro 7.2-1ubuntu11) 7.2
...
(gdb) bt
#0 0x00007fd1a99fd123 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:82
#1 0x0000000000405533 in ?? ()
#2 0x00007fd1a993deff in __libc_start_main (main=0x4043e3, argc=13, ubp_av=0x7fff25e7b478,
...
The backtrace is often the single most useful error report you can get from a process, so it is worth installing gdb(1) if it isn't already installed. gdb(1) can do a lot more than just show you backtraces, but a full tutorial is well outside the scope of Stack Overflow.
You can try pstack, passing the pid as a parameter. Use ps to get the process id (pid).
For example: pstack 1267