Linux kill() error unexpected - c++

kill(pid, 0) does not seem to set the error code as documented in the man page for kill():
Errors
The kill() function shall fail if:
EINVAL  The value of the sig argument is an invalid or unsupported signal number.
EPERM   The process does not have permission to send the signal to any receiving process.
ESRCH   No process or process group can be found corresponding to that specified by pid.
It is returning ENOENT (no such file or directory) and then sometimes it returns EINTR (system call interrupted)...
Here is what I am doing:
kill(g_StatusInstance[i].pid, SIGTERM) == -1 && log_fatal_syscall("kill-sigterm");
kill(g_StatusInstance[i].pid, 0);
log_info_console( "Checking process for errors: %s\n", strerror(errno));
if(errno != ENOENT)
{
    kill(g_StatusInstance[i].pid, SIGKILL) == -1 && log_fatal_syscall("kill-sigkill");
}
Am I doing something wrong?

Kill(pid, 0) seems to not set the error code correctly ...
It is returning ENOENT... EINTR
Here is what I am doing:
...
kill(g_StatusInstance[i].pid, 0);
log_info_console( "Checking process for errors: %s\n", strerror(errno));
Am I doing something wrong?
Yes. You are not checking the return value of the kill() system call. kill() does not set errno to any particular value in the successful case.
Try this:
if(kill(g_StatusInstance[i].pid, 0) == -1) {
    log_info_console( "Checking process for errors: %s\n", strerror(errno));
} else {
    log_info_console( "kill returned 0, process still alive\n" );
}
More generally, you ought to check the return value of every system call or library call, unless it is declared to return void.
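The errno values listed in the man page excerpt can be folded into a small helper. This is only a minimal sketch, not code from the original answer: `process_exists` is a hypothetical name, and treating EPERM as "the process exists" is an assumption based on the fact that the permission check can only fail for a process that is actually there.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>   /* getpid(), used in the usage note below */

/* Hypothetical helper: probe a PID with the null signal.
 * Returns 1 if the process exists, 0 if it does not,
 * and -1 for any other failure (errno is left set by kill()). */
int process_exists(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;          /* success: errno is NOT meaningful here */
    if (errno == EPERM)
        return 1;          /* exists, but we may not signal it */
    if (errno == ESRCH)
        return 0;          /* no such process */
    return -1;             /* e.g. EINVAL */
}
```

For example, `process_exists(getpid())` returns 1. Note that a terminated child that has not yet been reaped with waitpid() still counts as existing for kill(), so a check right after SIGTERM may report the process as alive.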

Based on the discussion, your question is likely "Why did my kill() not generate the effect that I expected?"
In order to understand why that is happening, you should first try strace on the process which is the target of the kill(). Attach it to your existing process by pid or invoke it under strace. strace will show modifications to the signal mask and indicate when signals arrive. If your signal is arriving, you should debug the process targeted by the kill() and try to understand what the installed/default signal handler is expected to do.

Related

How to get a process status (running, killed) event?

How do I get the status of another process?
I want to know the execution status of another process.
I want to receive and handle the status change as an event, the way inotify delivers events.
I do not want to poll /proc periodically.
How can I get another process's status (running, killed) as an event?
Systems: Linux, Solaris, AIX
Linux
Under Linux (and probably many other Unix systems) you can achieve this by using the ptrace call, then using waitpid to wait for status:
manpages:
ptrace call: http://man7.org/linux/man-pages/man2/ptrace.2.html
waitpid call: https://linux.die.net/man/2/waitpid
From the manpage:
Death under ptrace
When a (possibly multithreaded) process receives a killing signal
(one whose disposition is set to SIG_DFL and whose default action is
to kill the process), all threads exit. Tracees report their death
to their tracer(s). Notification of this event is delivered via
waitpid(2).
Beware that you will need special authorization in certain cases; take a look at /proc/sys/kernel/yama/ptrace_scope. (If you can modify the target program, you can also change the behavior of ptrace by calling ptrace(PTRACE_TRACEME, 0, nullptr, nullptr);)
To use ptrace, first get the PID of the target process, then attach to it with PTRACE_ATTACH:
// error checking removed for the sake of clarity
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
pid_t child_pid;
// ... Get your child_pid somehow ...
// 1. attach to your process:
long err;
err = ptrace(PTRACE_ATTACH, child_pid, nullptr, nullptr);
// 2. wait for your process to stop:
int process_status;
err = waitpid(child_pid, &process_status, 0);
// 3. restart the process (continue)
ptrace(PTRACE_CONT, child_pid, nullptr, nullptr);
// 4. wait for any change in status:
err = waitpid(child_pid, &process_status, 0);
// while waiting, the process is running...
// by default waitpid blocks until the status changes, but you can
// make it return immediately with WNOHANG in the options.
if (WIFEXITED(process_status)) {
    // exited
}
if (WIFSIGNALED(process_status)) {
    // process was terminated by a signal
    // WTERMSIG(process_status) will get you the signal that killed it.
}
AIX:
The solution will need some adaptation to work with AIX, have a look at the doc there:
ptrace documentation: https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/com.ibm.aix.basetrf1/ptrace.htm
waitpid documentation: https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/com.ibm.aix.basetrf1/ptrace.htm
Solaris
As mentioned here, ptrace may not be available on your version of Solaris; you may have to resort to procfs there.

How is the waitpid() function used in the implementation of system() in Linux?

I was going through Richard Stevens' "Advanced Programming in the UNIX Environment" and I found this topic.
8.13. system Function
Because system is implemented by calling fork, exec, and waitpid, there are three types of return values.
1. If either the fork fails or waitpid returns an error other than EINTR, system returns –1 with errno set to indicate the error.
2. If the exec fails, implying that the shell can't be executed, the return value is as if the shell had executed exit(127).
3. Otherwise, all three functions (fork, exec, and waitpid) succeed, and the return value from system is the termination status of the shell, in the format specified for waitpid.
As I understand it, we fork() a process for the command named by cmdstring, and exec() makes it separate from the parent process.
But I am unable to figure out how the waitpid() function is part of the system() function call.
The linked question, "ambiguous constructor call while object creation", did not give me a correct answer.
After you fork() off, your original process continues immediately, i.e. fork() returns at once. At that point, the new process is still running. Since system() is supposed to be synchronous, i.e. must only return after the executed program finishes, the original program now needs to call waitpid() on the PID of the new process to wait for its termination.
In a picture:
[main process]
      .
      .
      .
   fork()          [new process]
      A
     / \
    |   \
    |    \___ exec()
waitpid()       .
    z           .
    z           .  (running)
    z           .
    z         Done!
    z           |
    +----<----<-+
    |
    V
(continue)
The system() call would, in a Unix environment, look something like this:
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int system(const char *cmd)
{
    pid_t pid = fork();
    if (pid == 0) // We are in the child process.
    {
        // Ok, so it's more complicated than this, but the core idea is
        // to hand the command string to a shell.
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127); // exec failed, return 127. [exec doesn't return unless it failed!]
    }
    else
    {
        if (pid < 0)
        {
            return -1; // Failed to fork!
        }
        int status;
        if (waitpid(pid, &status, 0) > 0)
        {
            return status;
        }
    }
    return -1;
}
Please do note that this is SYMBOLICALLY what system does - it's a fair bit more complicated, because waitpid can give other values, and all sorts of other things that need checking.
From the man pages:
system() executes a command specified in command by calling /bin/sh -c command, and returns after the command has been completed. During execution of the command, SIGCHLD will be blocked, and SIGINT and SIGQUIT will be ignored.
system() presumably uses waitpid() to wait until the shell command finishes.

Determine if a process has suspended

I am trying to send a SIGTSTP signal to a particular process. How can I determine whether the process has actually been suspended, using C library functions or syscalls in Linux?
Read from /proc/[pid]/stat.
From the man page, you can get the status of a process from this file:
state %c
One character from the string "RSDZTW" where R is running, S is
sleeping in an interruptible wait, D is waiting in uninterruptible
disk sleep, Z is zombie, T is traced or stopped (on a signal), and W
is paging.
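Reading that field could look roughly like the sketch below. `proc_state` is a hypothetical helper name, and the parsing relies on one assumption worth stating: the command name in the second field is wrapped in parentheses and may itself contain spaces or parentheses, so the code searches for the last ')' instead of splitting on spaces.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>   /* getpid(), used in the usage note below */

/* Hypothetical helper: returns the state character from
 * /proc/<pid>/stat (e.g. 'R', 'S', 'T'), or 0 on failure. */
char proc_state(pid_t pid)
{
    char path[64];
    char buf[512];
    size_t n;
    char *p;
    FILE *f;

    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    f = fopen(path, "r");
    if (f == NULL)
        return 0;                 /* no such process, or no /proc */
    n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';

    /* Layout: "pid (comm) state ..."; state follows the last ')'. */
    p = strrchr(buf, ')');
    if (p == NULL || p[1] != ' ' || p[2] == '\0')
        return 0;
    return p[2];
}
```

A process that inspects itself, as in `proc_state(getpid())`, sees 'R' because it is running while it reads the file; a process stopped by SIGTSTP or SIGSTOP should report 'T'.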
I know this is an old post, but for anyone who is as curious as I was:
The simple answer is that there is only one STATIC, consistent way to check status, which is from /proc/[pid]/stat, BUT if you want to have as few architecture dependencies as possible and don't want to do that, you can check the signal.
A state change is only reported once, so you'll have to keep track of it yourself, but waitpid can poll a process to see whether it has been stopped or continued since you last checked (this works only for child processes of the caller):
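#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>

bool is_suspended = false;  // keep this between checks; it is only updated on a change
int status;
pid_t result = waitpid(pid, &status, WNOHANG | WUNTRACED | WCONTINUED);
if(result > 0) { // State change has been reported
    if (WIFSTOPPED(status)) {
        is_suspended = true;
    } else if (WIFCONTINUED(status)) {
        is_suspended = false;
    }
}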
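The snippet above polls with WNOHANG. If blocking is acceptable, the same check can be written as a wait that returns once the child has actually stopped. The following is a self-contained sketch under a few assumptions: `wait_for_stop` and `demo_stop_child` are hypothetical names, the target is a child of the caller, and the demo sends SIGSTOP instead of SIGTSTP because SIGSTOP cannot be caught or ignored, which keeps the outcome deterministic.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Blocks until the child changes state. Returns 1 if it stopped,
 * 0 if it exited or was killed instead, -1 on error. */
int wait_for_stop(pid_t child)
{
    int status;

    while (waitpid(child, &status, WUNTRACED) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return WIFSTOPPED(status) ? 1 : 0;
}

/* Demo: fork a child that just sleeps, stop it, confirm the stop,
 * then kill and reap it. Returns the result of wait_for_stop(). */
int demo_stop_child(void)
{
    int stopped;
    pid_t child = fork();

    if (child == -1)
        return -1;
    if (child == 0) {
        for (;;)
            pause();             /* child: wait for signals forever */
    }

    if (kill(child, SIGSTOP) == -1)
        stopped = -1;
    else
        stopped = wait_for_stop(child);

    kill(child, SIGKILL);        /* clean up the demo child */
    while (waitpid(child, NULL, 0) == -1 && errno == EINTR)
        ;
    return stopped;
}
```

With SIGTSTP the structure is the same, but the target may catch or ignore the signal, in which case waitpid() never reports a stop.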

System call interrupted by a signal still has to be completed

A lot of system calls, like close( fd ), can be interrupted by a signal. In this case usually -1 is returned and errno is set to EINTR.
The question is what is the right thing to do? Say, I still want this fd to be closed.
What I can come up with is:
while( close( fd ) == -1 )
    if( errno != EINTR ) {
        ReportError();
        break;
    }
Can anybody suggest a better/more elegant/standard way to handle this situation?
UPDATE:
As noticed by mux, SA_RESTART flag can be used when installing the signal handler.
Can somebody tell me which functions are guaranteed to be restartable on all POSIX systems(not only Linux)?
Some system calls are restartable: the kernel restarts the call automatically if it is interrupted, provided the SA_RESTART flag was used when installing the signal handler. The signal(7) man page says:
If a blocked call to one of the following interfaces is interrupted
by a signal handler, then the call will be automatically restarted
after the signal handler returns if the SA_RESTART flag was used;
otherwise the call will fail with the error EINTR:
It doesn't mention if close() is restartable, but these are:
read(2), readv(2), write(2), writev(2), ioctl(2), open(2)
wait(2), wait3(2), wait4(2), waitid(2), and waitpid(2)
accept(2), connect(2), recv(2), recvfrom(2), recvmsg(2), send(2), sendto(2), and sendmsg(2)
flock(2) and fcntl(2)
mq_receive(3), mq_timedreceive(3), mq_send(3), and mq_timedsend(3)
sem_wait(3) and sem_timedwait(3)
futex(2)
Note that those details, specifically the list of non-restartable calls, are Linux-specific.
I posted a related question about which system calls are restartable and whether POSIX specifies this anywhere. It is specified by POSIX, but it is optional, so you should check the list of non-restartable calls for your OS; if a call is not on that list, it should be restartable. This is my question:
How to know if a Linux system call is restartable or not?
Update: close() is a special case. It is not restartable and should not be retried on Linux; see this answer for more details:
https://stackoverflow.com/a/14431867/1157444
Assuming you're after shorter code, you can try something like:
while (((rc = close (fd)) == -1) && (errno == EINTR));
if (rc == -1)
    complainBitterly (errno);
Assuming you're after more readable code in addition to shorter, just create a function:
int closeWithRetry (int fd);
and place your readable code in there. Then it doesn't really matter how long it is, it's still a one-liner where you call it, but you can make the function body itself very readable:
int closeWithRetry (int fd) {
    // Initial close attempt.
    int rc = close (fd);
    // As long as you failed with EINTR, keep trying.
    // Possibly with a limit (count or time-based).
    while ((rc == -1) && (errno == EINTR))
        rc = close (fd);
    // Once either success or non-retry failure, return error code.
    return rc;
}
For the record: on essentially every UNIX, close() must not be retried if it returns EINTR. DO NOT put an EINTR retry loop in place for close() like you would for waitpid() or read(). See this page for more details: http://austingroupbugs.net/view.php?id=529 On Linux, Solaris, BSD and others, retrying close() is incorrect. HP-UX is the only common(!) system I could find that requires this.
EINTR means something very different for read() and select() and waitpid() and so on than it does for close(). For most calls, you retry on EINTR because you asked for something to be done which blocks, and if you were interrupted that means it didn't happen, so you try again. For close(), the action you requested was for an entry to be removed from the fd table, which is instantaneous, without error, and will always happen no matter what close() returns.[*] The only reason close() blocks is that sometimes, for special semantics (like TCP linger), it can wait until I/O is done before returning. If close returns EINTR, that means that you asked it to wait but it couldn't. However, the fd was still closed; you just lost your chance to wait on it.
Conclusion: unless you know you can't receive signals, using close() for waiting is a very stupid thing to do. Use an application-level ACK (TCP) or an fsync (file I/O) to make sure any writes were completed before closing the fd.
[*] There is a caveat: if another thread of the process is inside a blocking syscall on the same fd, well, ... it depends.
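The distinction drawn in this answer can be sketched as two small wrappers: one that retries read() on EINTR, and one that deliberately calls close() exactly once. The names `read_retry`, `close_once` and `demo_pipe_roundtrip` are hypothetical, and treating EINTR from close() as "already closed" is an assumption that follows the answer's description for Linux, Solaris and BSD; it would not be right on HP-UX.

```c
#include <errno.h>
#include <unistd.h>

/* read() that is retried while it fails with EINTR: the data was not
 * read yet, so asking again is the right thing to do. */
ssize_t read_retry(int fd, void *buf, size_t count)
{
    ssize_t n;

    do {
        n = read(fd, buf, count);
    } while (n == -1 && errno == EINTR);
    return n;
}

/* close() that is never retried: on EINTR the descriptor is assumed to
 * be gone already, so a second close() could hit an unrelated fd that
 * another thread has opened in the meantime. */
int close_once(int fd)
{
    if (close(fd) == -1 && errno != EINTR)
        return -1;               /* a real error such as EBADF or EIO */
    return 0;
}

/* Demo: push two bytes through a pipe and return how many were read. */
int demo_pipe_roundtrip(void)
{
    int fds[2];
    char buf[8];
    ssize_t n;

    if (pipe(fds) == -1)
        return -1;
    if (write(fds[1], "ok", 2) != 2)
        return -1;
    n = read_retry(fds[0], buf, sizeof buf);
    if (close_once(fds[0]) == -1 || close_once(fds[1]) == -1)
        return -1;
    return (int)n;
}
```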

Child process receives parent's SIGINT

I have one simple program that's using Qt Framework.
It uses QProcess to execute RAR and compress some files. In my program I am catching SIGINT and doing something in my code when it occurs:
signal(SIGINT, &unix_handler);
When SIGINT occurs, I check if RAR process is done, and if it isn't I will wait for it ... The problem is that (I think) RAR process also gets SIGINT that was meant for my program and it quits before it has compressed all files.
Is there a way to run RAR process so that it doesn't receive SIGINT when my program receives it?
Thanks
If you are generating the SIGINT with Ctrl+C on a Unix system, then the signal is being sent to the entire process group.
You need to use setpgid or setsid to put the child process into a different process group so that it will not receive the signals generated by the controlling terminal.
[Edit:]
Be sure to read the RATIONALE section of the setpgid page carefully. It is a little tricky to plug all of the potential race conditions here.
To guarantee 100% that no SIGINT will be delivered to your child process, you need to do something like this:
#define CHECK(x) if(!(x)) { perror(#x " failed"); abort(); /* or whatever */ }
/* Block SIGINT. */
sigset_t mask, omask;
sigemptyset(&mask);
sigaddset(&mask, SIGINT);
CHECK(sigprocmask(SIG_BLOCK, &mask, &omask) == 0);
/* Spawn child. */
pid_t child_pid = fork();
CHECK(child_pid >= 0);
if (child_pid == 0) {
    /* Child */
    CHECK(setpgid(0, 0) == 0);
    execl(...);
    abort();
}
/* Parent */
if (setpgid(child_pid, child_pid) < 0 && errno != EACCES)
    abort(); /* or whatever */
/* Unblock SIGINT */
CHECK(sigprocmask(SIG_SETMASK, &omask, NULL) == 0);
Strictly speaking, every one of these steps is necessary. You have to block the signal in case the user hits Ctrl+C right after the call to fork. You have to call setpgid in the child in case the execl happens before the parent has time to do anything. You have to call setpgid in the parent in case the parent runs and someone hits Ctrl+C before the child has time to do anything.
The sequence above is clumsy, but it does handle 100% of the race conditions.
What are you doing in your handler? There are only certain Qt functions that you can call safely from a Unix signal handler. This page in the documentation identifies which ones they are.
The main problem is that the handler will execute outside of the main Qt event thread. That page also proposes a method to deal with this. I prefer getting the handler to "post" a custom event to the application and handle it that way. I posted an answer describing how to implement custom events here.
Just make the subprocess ignore SIGINT:
child_pid = fork();
if (child_pid == 0) {
    /* child process */
    signal(SIGINT, SIG_IGN);
    execl(...);
}
man sigaction:
During an execve(2), the dispositions of handled signals are reset to the default;
the dispositions of ignored signals are left unchanged.
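That rule can be checked with a small experiment. The sketch below is an illustration only: `demo_ignored_sigint_survives_exec` is a hypothetical name, /bin/sh is assumed to exist, and the shell command simply sends SIGINT to itself and then exits with status 7, which it can only reach if the ignored disposition survived the exec.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that ignores SIGINT, execs a shell in it, and lets the
 * shell signal itself. Returns the shell's exit status, the negated
 * signal number if it was killed, or -1 on error. */
int demo_ignored_sigint_survives_exec(void)
{
    int status;
    pid_t child = fork();

    if (child == -1)
        return -1;
    if (child == 0) {
        /* Child: ignore SIGINT before exec; the setting is inherited. */
        signal(SIGINT, SIG_IGN);
        execl("/bin/sh", "sh", "-c", "kill -INT $$; exit 7", (char *)NULL);
        _exit(127);              /* only reached if exec failed */
    }

    while (waitpid(child, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }
    if (WIFSIGNALED(status))
        return -WTERMSIG(status);
    return WEXITSTATUS(status);
}
```

A result of 7 means the SIGINT was ignored in the exec'd program. Keep in mind that a program which installs its own SIGINT handler after starting overrides the inherited setting, so this approach only helps with children that leave SIGINT alone.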