I ran into a signal-handling problem under Linux. My goal is for the process to survive SIGTERM, but roughly 1 time in 60 the process still exits.
Pseudocode of my application:
static int g_count_sig_old = 0;
static volatile int g_count_sig = 0;
void _signalhandler(){
g_count_sig ++;
printf(...); // Maybe not safe, just for debugging
myTrace(...); // Write log to file1 is not signal safe, but just for debugging.
}
main(){
sigaction(...) // Register signal handler for SIGTERM
while(1){
sleep(1); // wait one second (sleep() takes seconds)
myNewTrace(...); // Output value of g_count_sig to file2
if( g_count_sig != g_count_sig_old ) {
g_count_sig_old = g_count_sig;
printf(...); // output value of g_count_sig
myNewTrace(...); // Output value of g_count_sig to file2
}
}
}
I expected this application not to quit when it receives SIGTERM, but testing shows otherwise: sometimes the process still exits after receiving SIGTERM. I have confirmed that the process received SIGTERM when the issue occurred, because I can see the console output and the trace file.
So I am puzzled: why does this application exit even though SIGTERM is handled? I am not sure how to track down the cause of this issue, or whether it is expected behavior under Linux.
I hope you can help. Thanks!
I have a main process and some child processes spawned from it. At some point I have to send SIGINT to all the child processes but not to the main process. I am unable to store the PIDs of all the child processes, so I used SIG_IGN to ignore SIGINT in the main process and set it back to the default after sending the signal. But it is not working.
Please find my code snippet below:
/* Find group id for process */
nPgid = getpgid(parentPID);
/* Ignore SIGINT signal in parent process */
if (signal(SIGINT, SIG_IGN) == SIG_ERR)
{
cout << "Error in ignoring signal \n";
}
/* Send SIGINT signal to all process in the group */
nReturnValue = kill ( (-1 * nPgid), SIGINT);
if (nReturnValue == RETURN_SUCCESS)
{
cout << "Sent SIGINT signal to all process in group successfully \n";
}
else
{
cout << "Alert!!! Unable to send SIGINT signal to all process in the group \n";
}
/* Set SIGINT signal status to default */
signal (SIGINT, SIG_DFL);
sleep(2);
I am not getting any error, but the parent is getting killed. Am I doing anything wrong here?
nPgid = getpgid(parentPID);
What is parentPID? To get the process group of the calling process, pass either 0 or the result of getpid().
From man getpgid():
getpgid() returns the PGID of the process specified by pid. If pid
is zero, the process ID of the calling process is used. (Retrieving
the PGID of a process other than the caller is rarely necessary, and
the POSIX.1 getpgrp() is preferred for that task.)
From the text above I'd conclude that you should call
nPgid = getpgid(0);
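The point of the man page excerpt can be checked with a small sketch. The helper name callerGroupIsConsistent is made up for illustration; it only compares the three standard ways of asking for the caller's own process group ID and assumes a POSIX system.

```cpp
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: returns true when all three ways of asking for the
// caller's own process group ID agree.
bool callerGroupIsConsistent() {
    pid_t viaZero = getpgid(0);         // 0 means "the calling process"
    pid_t viaPid  = getpgid(getpid());  // explicit PID of the caller
    pid_t viaGrp  = getpgrp();          // POSIX.1 shortcut for the caller's group
    return viaZero != -1 && viaZero == viaPid && viaZero == viaGrp;
}
```

If parentPID in the question holds anything other than the caller's own PID (or 0), the group passed to kill may not be the one the caller belongs to.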
I'm trying to create a parent and a child process that communicate through a pipe.
I've set up the child to listen to its parent through a pipe, with a read call running in a while loop.
In order to debug my program I print debug messages to standard output (note that my read call uses the pipe's file descriptor, which is different from 0 or 1).
For some reason these debug messages are being received by the read call in my child process. I can't understand why this is happening. What could be causing this? What elegant solution is there (apart from writing to standard error instead of standard output)?
This code causes an endless loop because the cout message just triggers another read. Why? Note that the child process exits upon receiving a CHILD_EXIT_CODE message from the parent.
int myPipe[2];
pipe(myPipe);
if(fork() == 0)
{
int readPipe = myPipe[0];
while(true)
{
size_t nBytes = read(readPipe, readBuffer, sizeof(readBuffer));
std::cout << readBuffer << "\n";
int newPosition = atoi(readBuffer);
if(newPosition == CHILD_EXIT_CODE)
{
exit(0);
}
}
}
Edit: Code creating the pipe and fork
I do not know what your parent process is doing (you did not post that code), but from your description it seems that your parent and child processes are sharing the same stdout stream (the child inherits copies of the parent's set of open file descriptors; see man fork).
I guess what you should do is attach the stdout and stderr streams in your parent process to the write ends of your pipes (you need one more pipe for the stderr stream).
This is what I would try if I were in your situation (in my opinion you are missing dup2):
pid_t pid; /*Child or parent PID.*/
int out[2], err[2]; /*Store pipes file descriptors. Write ends attached to the stdout*/
/*and stderr streams.*/
// Init value as error.
out[0] = out[1] = err[0] = err[1] = -1;
/*Creating pipes, they will be attached to the stderr and stdout streams*/
if (pipe(out) < 0 || pipe(err) < 0) {
/* Error: you should log it */
exit (EXIT_FAILURE);
}
if ((pid=fork()) == -1) {
/* Error: you should log it */
exit (EXIT_FAILURE);
}
if (pid != 0) {
/*Parent process*/
/*Attach stderr and stdout streams to your pipes (their write end)*/
if ((dup2(out[1], 1) < 0) || (dup2(err[1], 2) < 0)) {
/* Error: you should log it */
/* The child is going to be an orphan process you should kill it before calling exit.*/
exit (EXIT_FAILURE);
}
/*WHATEVER YOU DO WITH YOUR PARENT PROCESS*/
/* The child is going to be an orphan process you should kill it before calling exit.*/
exit(EXIT_SUCCESS);
}
else {
/*Child process*/
}
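To see what dup2 does to a stream, here is a minimal sketch, not the code from the question. The function name captureChildStdout is hypothetical, and for simplicity the redirection is done in the child rather than in the parent: once fd 1 refers to the pipe's write end, anything written to stdout shows up on the read end.

```cpp
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks a child whose stdout (fd 1) is redirected into a
// pipe, lets the child write `text` to fd 1, and returns what the parent reads
// from the pipe. Returns an empty string if pipe() or fork() fails.
std::string captureChildStdout(const std::string& text) {
    int fds[2];
    if (pipe(fds) < 0) return "";

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return "";
    }
    if (pid == 0) {
        // Child: make fd 1 refer to the pipe's write end.
        close(fds[0]);
        if (dup2(fds[1], STDOUT_FILENO) < 0) _exit(1);
        close(fds[1]);
        ssize_t ignored = write(STDOUT_FILENO, text.data(), text.size());
        (void)ignored;
        _exit(0);
    }

    // Parent: close the write end so read() sees EOF once the child exits.
    close(fds[1]);
    std::string result;
    char buf[256];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0) {
        result.append(buf, static_cast<size_t>(n));
    }
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return result;
}
```

Calling captureChildStdout("hello") should hand back the text the child wrote to its stdout, which is the same effect the asker observes when debug output ends up in a pipe.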
You should not forget a couple of things:
Call wait or waitpid from the parent process to release the resources associated with the child process when it dies.
If you use wait or waitpid, you might have to think about blocking SIGCHLD before calling fork; in that case you should unblock SIGCHLD in your child process right after fork, at the beginning of your child process code (a child created via fork(2) inherits a copy of its parent's signal mask; see sigprocmask).
Something that is often forgotten: be aware of the EINTR error. dup2, waitpid/wait, read and many others are affected by it.
If your parent process dies before your child process, you should try to kill the child process if you do not want it to become an orphan.
Take a look at _exit. Perhaps you should use it in your child process instead of exit.
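Three of the points above (reaping with waitpid, retrying on EINTR, and leaving the child with _exit) can be sketched together. The function name runChildAndReap is made up for this illustration.

```cpp
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks a child that leaves immediately with `code` via
// _exit, then reaps it. Returns the child's exit status, or -1 if fork or
// waitpid failed or the child did not exit normally.
int runChildAndReap(int code) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        _exit(code);  // skips atexit handlers and stdio buffers copied from the parent
    }
    int status = 0;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);  // retry if a signal interrupted the wait
    if (r != pid || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}
```

For example, runChildAndReap(7) should return 7 once the child has been reaped.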
In my program's main loop, I'm listening on an EMS topic by calling tibemsMsgConsumer_Receive. Meanwhile, I want to exit the program at a specific time, say 5PM. How can I implement this?
I tried the following code, but it doesn't work properly when no message is received.
Is there a way to exit the program while the 'while' loop is stuck in the receive call?
while (1)
{
status = tibemsMsgConsumer_Receive(m_CmbsSpreadMatrixSubscriber, &msg);
if (status == TIBEMS_OK)
{
DoSomething();
}
if (atoi(getRunTime("hour").c_str()) >= 18)
{
exit(0);
}
}
Use tibemsMsgConsumer_ReceiveTimeout() and set an appropriate timeout to check your exit condition repeatedly.
From the description on that page:
This function consumes the next message from the consumer’s destination. When the destination does not have any messages ready, this function blocks:
If a message arrives at the destination, this call immediately consumes that message and returns.
If the (non-zero) timeout elapses before a message arrives, this call returns TIBEMS_TIMEOUT.
If another thread closes the consumer, this call returns TIBEMS_INTR.
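The shape of such a loop might look like the sketch below. It deliberately does not call the EMS library: RecvStatus, receiveUntil and the receive callback are hypothetical stand-ins for TIBEMS_OK / TIBEMS_TIMEOUT and for a call to tibemsMsgConsumer_ReceiveTimeout bound to a consumer, so treat it as an outline of the control flow only.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical stand-in for the status codes TIBEMS_OK and TIBEMS_TIMEOUT.
enum class RecvStatus { Ok, Timeout };

using Clock = std::chrono::steady_clock;

// Runs a receive loop until `deadline` has passed and returns the number of
// messages handled. `receive` is a hypothetical wrapper around a timed
// receive call; it must return within roughly `timeout`.
int receiveUntil(Clock::time_point deadline,
                 std::chrono::milliseconds timeout,
                 const std::function<RecvStatus(std::chrono::milliseconds)>& receive) {
    int handled = 0;
    while (Clock::now() < deadline) {
        if (receive(timeout) == RecvStatus::Ok) {
            ++handled;  // the question's DoSomething() would run here
        }
        // On Timeout the loop simply re-checks the exit condition.
    }
    return handled;
}
```

Because each receive call returns after at most the timeout, the exit condition is checked regularly even when no message ever arrives. A shorter timeout means a more punctual exit at the cost of more wake-ups.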
Before starting the main loop that listens for messages, I start a thread:
boost::thread aThread(&threadFunc);
and in the thread function I simply check the time and exit the program. I'm not sure whether
this is safe and correct...
void threadFunc()
{
while (true)
{
wait(60);
if (atoi(getRunTime("hour").c_str()) >= 18)
{
Log("Now it's 6PM, let's stop and get back tomorrow.");
exit(0);
}
}
}
I'm writing a web server on Linux in C++ with pthreads. I tested it with valgrind for leaks and memory problems: all fixed. I tested it with helgrind for thread problems: all fixed. Now I'm trying a stress test, and I get a problem when the program is run under helgrind:
valgrind --tool=helgrind ./chats
It just dies at random places with the text "Killed", as it would if I killed it with kill -9. The only report I sometimes get from helgrind is that the program exits while still holding some locks, which is normal when it gets killed.
When checking for leaks:
valgrind --leak-check=full ./chats
it's more stable, but I managed to make it die once with a few hundred concurrent connections.
I tried running the program alone and couldn't make it crash at all. I tried up to 250 concurrent connections. Each thread delays for 100 ms to make it easier to have multiple connections at the same time. No crash.
In all cases the number of threads and connections does not go above 10, and I see it crash even with 2 connections, but never with only one connection at a time (including the main thread and one helper thread, that is a total of 3 threads).
Is it possible that the problem only happens when run under helgrind, or does helgrind just make it more likely to show?
What could be the reason that a program gets killed (by the kernel?): allocating too much memory, too many file descriptors?
I tested a bit more and found out that it only dies when the client times out and closes the connection. So here is the code which detects that the client closed the socket:
void *TcpClient::run(){
int ret;
struct timeval tv;
char * buff = (char *)malloc(10001);
int br;
colorPrintf(TC_GREEN, "new client starting: %d\n", sockFd);
while(isRunning()){
tv.tv_sec = 0;
tv.tv_usec = 500*1000;
FD_SET(sockFd, &readFds);
ret = select(sockFd+1, &readFds, NULL, NULL, &tv);
if(ret < 0){
//select error
continue;
}else if(ret == 0){
// no data to read
continue;
}
br = read(sockFd, buff, 10000);
if (br <= 0){
// client disconnected (0) or read error (-1); check before indexing buff with br
setRunning(false);
break;
}
buff[br] = 0;
if (reader != NULL){
reader->tcpRead(this, std::string(buff, br));
}else{
readBuffer.append(buff, br);
}
//printf("received: %s\n", buff);
}
free(buff);
sendFeedback((void *)1);
colorPrintf(TC_RED, "closing client socket: %d\n", sockFd);
::close(sockFd);
sockFd = -1;
return NULL;
}
// this method writes to socket
bool TcpClient::write(std::string data){
int bw;
int dataLen = data.length();
bw = ::write(sockFd, data.data(), dataLen);
if (bw != dataLen){
return false; // I don't close the socket in this case, maybe I should
}
return true;
}
P.S. Threads are:
main thread. connections are accepted here.
one helper thread which listens for signals and sends signals. It blocks signal delivery for the app and manually polls the signal queue. The reason is that it's hard to handle signals when using threads. I found this technique here on Stack Overflow and it seems to work pretty well in other projects.
client connection threads
The full code is pretty big, but I can post chunks if someone is interested.
Update:
I managed to trigger the problem with only one connection. It's all happening in client thread. This is what I do:
I read/parse headers. I put delay before writing so the client can timeout (which causes the problem).
Here the client times out and leaves (probably closes the socket)
I write back headers
I write back the html code.
Here is how I write back
bw = ::write(sockFd, data.data(), dataLen);
// bw is = dataLen = 108 when writing the headers
//then secondary write for HTML kills the program. there is a message before and after write()
bw = ::write(sockFd, data.data(), dataLen); // doesn't go past this point second time
Update 2: Got it :)
gdb says:
Program received signal SIGPIPE, Broken pipe.
[Switching to Thread 0x41401940 (LWP 10554)]
0x0000003ac2e0d89b in write () from /lib64/libpthread.so.0
Question 1: What should I do to avoid receiving this signal?
Question 2: How do I know that the remote side disconnected while writing? On read, select reports that there is data but the read returns 0. What about write?
Well, I just had to handle the SIGPIPE signal; write then returned -1, so I close the socket and quit the thread gracefully. Works like a charm.
I guess the easiest way is to set the signal handler for SIGPIPE to SIG_IGN:
signal(SIGPIPE, SIG_IGN);
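A minimal sketch of the effect, using a plain pipe instead of a socket so that it is self-contained. The function name writeToClosedPipeErrno is made up for illustration. With SIGPIPE ignored, writing to an endpoint nobody reads no longer kills the process; write returns -1 and errno is set to EPIPE.

```cpp
#include <cerrno>
#include <csignal>
#include <unistd.h>

// Hypothetical helper: writes to a pipe whose read end is already closed while
// SIGPIPE is ignored. Returns the errno reported by write(), 0 if the write
// unexpectedly succeeded, or -1 if the pipe could not be created.
int writeToClosedPipeErrno() {
    // Ignore SIGPIPE and remember the previous disposition so it can be restored.
    void (*previous)(int) = signal(SIGPIPE, SIG_IGN);

    int fds[2];
    if (pipe(fds) < 0) {
        signal(SIGPIPE, previous);
        return -1;
    }
    close(fds[0]);  // nobody will ever read: the "peer" is gone

    int result = 0;
    const char msg[] = "x";
    if (write(fds[1], msg, 1) == -1) {
        result = errno;  // EPIPE is what POSIX specifies here
    }
    close(fds[1]);
    signal(SIGPIPE, previous);
    return result;
}
```

In the server, the equivalent check is to treat a -1 return with errno equal to EPIPE as "the client is gone" and close the socket.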
Note that the first write was successful and didn't kill the program. If you have a similar problem, check whether you are writing once or multiple times. If you are not familiar with gdb, this is how to do it:
gdb ./your-program
> run
and gdb will tell you all about signals and segfaults.
I'm trying to make a Win32/*nix console-based ASCII game. I want to use no libraries whatsoever that aren't standard C++ or on *nix/windows(.h).
I want it to be structured like a game loop. Aka:
while (!WIN_CLOSE_FUNCTION()) {
//Do crap
}
//Do other shutdown crap
return 0;
Can anyone point me to what function this would be? If it is platform dependent, give me one example on Windows and *nix.
For the Unix/Linux console, there is no such function. The closest you can do is to catch the signal SIGHUP which is sent when losing the terminal. However be aware that the things you can do in a signal handler are quite limited. Probably the closest to your loop would be (note: untested code):
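#include <signal.h>
#include <iostream>
volatile sig_atomic_t hupflag = 0;
extern "C" void hangup(int)
{
hupflag = 1;
}
int main()
{
struct sigaction act; // "struct" is needed because the function sigaction() hides the type name
act.sa_handler = hangup;
sigemptyset(&act.sa_mask); // a sigset_t must be cleared with sigemptyset, not assigned 0
act.sa_flags = 0;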
if (sigaction(SIGHUP, &act, 0) < 0)
{
std::cerr << "could not install signal handler\n";
return 1;
}
while (!hupflag)
{
// ...
}
// shutdown
return 0;
}
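The flag mechanism can be exercised without closing a terminal by sending SIGHUP to the process itself. This is only a sketch; the names hupSeen, onHangup and hangupSetsFlag are invented for the illustration.

```cpp
#include <signal.h>

static volatile sig_atomic_t hupSeen = 0;

extern "C" void onHangup(int) { hupSeen = 1; }

// Hypothetical helper: installs the handler, sends SIGHUP to the calling
// process with raise(), and reports whether the flag was set. The previous
// disposition of SIGHUP is restored before returning.
bool hangupSetsFlag() {
    struct sigaction act {};
    struct sigaction old {};
    act.sa_handler = onHangup;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
    if (sigaction(SIGHUP, &act, &old) < 0) return false;

    hupSeen = 0;
    raise(SIGHUP);  // returns only after the handler has run, provided SIGHUP is not blocked
    bool seen = (hupSeen == 1);

    sigaction(SIGHUP, &old, nullptr);
    return seen;
}
```

The handler does nothing except set a sig_atomic_t flag, which keeps it within what is safe to do in a signal handler; all real shutdown work stays in the main loop.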
A similar question that might help you: What happens when you close a c++ console application
The accepted answer is:
Closing a c++ console app with the "x" in the top corner raises a CTRL_CLOSE_EVENT, which you can catch and process if you set a control handler using the SetConsoleCtrlHandler function.
Useful links:
Console Event Handling
SetConsoleCtrlHandler
On *nix:
On Linux and other Unix systems, the console runs as a separate process. As you close the shell, it sends the SIGHUP signal to the currently active process or processes that are not executed in the background. If the programmer does not handle it, the process simply terminates. The same signal is sent if you close the SSH session with a terminal and an active process.
answer provided by #Zyx in the question linked above
There isn't such a function per se, but both Unix and Windows will send
a signal (SIGHUP under Unix, SIGBREAK under Windows) to all
processes in the process group when the window on which the process
group depends is closed. So all you have to do is catch the signal and
set a flag, which you test in the loop:
#include <signal.h>
#ifdef _WIN32
int const sigClosed = SIGBREAK;
#else
int const sigClosed = SIGHUP;
#endif
volatile sig_atomic_t windowClosed = 0;
void signalHandler( int )
{
windowClosed = 1;
}
// ...
signal( sigClosed, signalHandler );
while ( windowClosed == 0 ) {
// ...
}
If you're doing any input from the console in the loop, you'll have to
be prepared for the input to fail (which you should be anyway).