Is this a recursive mutex lock? - C++

I have a question about pthread_mutex_lock. My code does the following:
RIL_startEventLoop locks the mutex with pthread_mutex_lock(&s_startupMutex);
RIL_startEventLoop creates the eventLoop thread;
eventLoop locks the mutex with pthread_mutex_lock(&s_startupMutex);
eventLoop unlocks the mutex with pthread_mutex_unlock(&s_startupMutex);
RIL_startEventLoop unlocks the mutex with pthread_mutex_unlock(&s_startupMutex);
My understanding is the following (correct me if I am wrong):
pthread_mutex_lock locks the mutex exclusively for the calling task. If the mutex is not available right now, the task sleeps until it can get it.
The mutex must later be released by the same task that acquired it. Recursive locking is not allowed.
Why can eventLoop lock the same mutex before RIL_startEventLoop has released it?
These 2 functions are in Ril.cpp:
RIL_startEventLoop(void) {
int ret;
pthread_attr_t attr;
#ifdef MTK_RIL
RIL_startRILProxys();
#endif /* MTK_RIL */
/* spin up eventLoop thread and wait for it to get started */
s_started = 0;
pthread_mutex_lock(&s_startupMutex);
pthread_attr_init (&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
ret = pthread_create(&s_tid_dispatch, &attr, eventLoop, NULL);
while (s_started == 0) {
pthread_cond_wait(&s_startupCond, &s_startupMutex);
}
pthread_mutex_unlock(&s_startupMutex);
if (ret < 0) {
LOGE("Failed to create dispatch thread errno:%d", errno);
return;
}
}
static void *
eventLoop(void *param) {
int ret;
int filedes[2];
ril_event_init();
pthread_mutex_lock(&s_startupMutex);
s_started = 1;
pthread_cond_broadcast(&s_startupCond);
pthread_mutex_unlock(&s_startupMutex);
ret = pipe(filedes);
if (ret < 0) {
LOGE("Error in pipe() errno:%d", errno);
return NULL;
}
s_fdWakeupRead = filedes[0];
s_fdWakeupWrite = filedes[1];
fcntl(s_fdWakeupRead, F_SETFL, O_NONBLOCK);
ril_event_set (&s_wakeupfd_event, s_fdWakeupRead, true,
processWakeupCallback, NULL);
rilEventAddWakeup (&s_wakeupfd_event);
// Only returns on error
ril_event_loop();
LOGE ("error in event_loop_base errno:%d", errno);
return NULL;
}

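Because the call to pthread_cond_wait() releases the mutex while it waits, so this is not recursive locking.
pthread_cond_wait() does the following; the release and the suspension happen as one atomic step:
Releases the mutex (the caller must already hold it)
Suspends the thread until the condition is signalled
Re-acquires the mutex before returning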
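The handshake above can be sketched in a few lines. This is only a minimal illustration, assuming plain pthreads; g_mutex, g_cond, g_started, worker and start_and_wait are made-up names modelled on the question's code, not the RIL sources.

```cpp
#include <pthread.h>

// Hypothetical names, modelled on the question's s_startupMutex / s_started.
static pthread_mutex_t g_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  g_cond  = PTHREAD_COND_INITIALIZER;
static int g_started = 0;

static void *worker(void *) {
    // Blocks until the starter is inside pthread_cond_wait(),
    // because that call is what releases the mutex.
    pthread_mutex_lock(&g_mutex);
    g_started = 1;
    pthread_cond_broadcast(&g_cond);
    pthread_mutex_unlock(&g_mutex);
    return nullptr;
}

// Returns 1 once the worker has reported that it started, or -1 on error.
int start_and_wait() {
    pthread_t tid;
    pthread_mutex_lock(&g_mutex);
    g_started = 0;
    if (pthread_create(&tid, nullptr, worker, nullptr) != 0) {
        pthread_mutex_unlock(&g_mutex);
        return -1;
    }
    while (g_started == 0) {
        // Atomically unlocks g_mutex and sleeps; re-locks before returning.
        pthread_cond_wait(&g_cond, &g_mutex);
    }
    int started = g_started;
    pthread_mutex_unlock(&g_mutex);
    pthread_join(tid, nullptr);
    return started;
}
```

The starter holds the mutex from before pthread_create() until after the loop, yet the worker still gets the lock, because the starter only "holds" it outside of the wait.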

Related

sigwait() does not work in multithreaded program

I'm trying to write a multithreaded program in which one thread (the variable thread below) is responsible for handling any asynchronous signals that might be sent to this process.
My problem is that the thread that calls sigwait() does not react to any signals sent to the process (such as SIGUSR1 below).
static void * signal_thread(void *arg = nullptr)
{
int sig = -1;
sigset_t sigset;
sigfillset(&sigset);
pthread_sigmask(SIG_BLOCK, &sigset, NULL);
while(1)
{
int s = sigwait(&sigset, &sig);
if(s == 0)
printf("SIG %d recieved!...\n", sig);
usleep(20);
}
}
int main()
{
sigset_t signalset;
pthread_t thread;
pthread_create(&thread, NULL, &signal_thread, nullptr);
sigfillset(&signalset);
pthread_sigmask(SIG_BLOCK, &signalset, NULL);
while(1)
{
raise(SIGUSR1);
usleep(20);
}
}
The problem comes down to two issues:
First, the call to raise() in main sends the signal only to the main thread, not to the whole process.
Second, std::cout should be used instead of printf in signal_thread.
In a multithreaded program, raise(sig) is the equivalent of calling pthread_kill(pthread_self(), sig).
Since the main thread raise()s the signal, SIGUSR1 is generated for that thread and not for any other. Thus, your signal_thread cannot sigwait() for the SIGUSR1, which stays pending for the thread that generated it. To generate a process-directed signal that any thread can accept, use kill(getpid(), sig) instead.
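To make the difference concrete, here is a small sketch that sends the signal to the process rather than to the calling thread. It assumes a POSIX system; wait_for_usr1 and deliver_to_waiter are made-up names for illustration.

```cpp
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

// Thread body: synchronously accept SIGUSR1 and report which signal arrived.
static void *wait_for_usr1(void *out) {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    int sig = 0;
    if (sigwait(&set, &sig) == 0) {
        *static_cast<int *>(out) = sig;
    }
    return nullptr;
}

// Returns the signal number received by the waiting thread, or -1 on error.
int deliver_to_waiter() {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    // Block before creating the thread so the new thread inherits the mask.
    if (pthread_sigmask(SIG_BLOCK, &set, nullptr) != 0) return -1;

    int received = -1;
    pthread_t tid;
    if (pthread_create(&tid, nullptr, wait_for_usr1, &received) != 0) return -1;

    // kill() targets the process, so the pending signal can be taken by the
    // thread in sigwait(). raise() would target the calling thread only.
    kill(getpid(), SIGUSR1);

    pthread_join(tid, nullptr);
    return received;
}
```

Because SIGUSR1 is blocked in every thread, it stays pending on the process until the waiting thread collects it, even if kill() runs before that thread reaches sigwait().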

Windows Errorcode : 298 for Bounded buffer solution in vc++

I have run into the bounded buffer problem in my project. To solve it,
I'm using two semaphores, Full and Empty.
The write operation waits for the Empty semaphore and signals the Full
semaphore after it finishes writing.
The read operation waits for the Full semaphore and signals the Empty
semaphore after it finishes reading.
Since I'm using blocking calls for read and write, each read and
write happens in sequence.
I'm implementing this in VC++ on Windows, but I get Windows
error code 298 when signalling the Empty semaphore, which says "Too many posts were made to a semaphore."
What are the possible causes of the error "too many posts were
made to a semaphore"?
semaphore creation:
string semName = m_mqName;
semName.append(SEMAPHORE_FULL_NAME_SUFFIX);
cout<<"\n <MessageQueue<DType, size>::CreateMsgQSemaphores ()> Semaphore name = "<<semName<<endl;
m_mqFullSemaphore = CreateSemaphore(
NULL, // default security attributes
0, // initial count
size, // maximum count
semName.c_str()); //semaphore name
if (m_mqFullSemaphore == nullptr)
{
cout<<"ERROR::CreateSemaphore Failed, Name:"<<semName<<",error code:"<<GetLastError()<<endl;
CloseHandle(m_mqMutex); // close the existing m_mqMutex
createSemaphoresStatus = CreateMsgQSemaphoresFailedError;
}
else if (ERROR_ALREADY_EXISTS == GetLastError())
{
cout<<"\n <MessageQueue<DType, size>::CreateMsgQSemaphores ()>::WARNING: 'full' Semaphore exist.. "<<semName<<endl;
}
else
//cout<<"***INFO:: semaphore created: m_mqFullSemaphore= "<<m_mqFullSemaphore<<endl;
//------------------------------------------Empty semaphore creation--------------------------------------------//
semName = m_mqName;
semName.append(SEMAPHORE_EMPTY_NAME_SUFFIX);
//cout<<"\n <MessageQueue<DType, size>::CreateMsgQSemaphores ()> Semaphore name = "<<semName<<endl;
m_mqEmptySemaphore = CreateSemaphore(
NULL, // default security attributes
size, // initial count
size, // maximum count
semName.c_str()); // semaphore Name
if(m_mqEmptySemaphore == nullptr)
{
cout<<"\n <MessageQueue<DType, size>::CreateMsgQSemaphores ()>::ERROR: Create empty Semaphore failed.. "<<semName<<endl;
CloseHandle(m_mqMutex); // close the existing m_mqMutex
createSemaphoresStatus = CreateMsgQSemaphoresFailedError;
}
Consumer thread (reader in my project)
DWORD dwFullSemaphoreWaitResult = WaitForSingleObject(m_mqFullSemaphore,
timeoutVal); // wait for the full semaphore
if(dwFullSemaphoreWaitResult== WAIT_OBJECT_0) // got the full semaphore
{
//proceed further
DWORD dwMutexWaitResult = WaitForSingleObject( m_mqMutex, timeoutVal); // no time-out interval
//Queue_Mutex_Handler mutexHandler(m_mqMutex);
//RAII
LOG_MSG("SUCCESS: to aquire m_mqFullSemaphore:"<<m_mqName);
switch (dwMutexWaitResult)
{
case WAIT_OBJECT_0: // got ownership of the mutex
{
LOG_MSG("SUCCESS: to aquire m_mqMutex:"<<m_mqName);
size_t qSize = 0;
if(! m_pMsgQueueImpl->Dequeue(destMsg,qSize))
{
LOG_MSG("SUCCESS: Reached here:"<<m_mqName);
LOG_MSG("ERROR: Dequeue failed, MSG Queue is Empty:"<< m_mqName);
//ReleaseMutex(m_mqMutex);
execResult = MQState_Queue_Empty_Error;
if(0 == ReleaseMutex(m_mqMutex))
{
LOG_MSG("Release mutex error:"<<GetLastError());
}
}
else
{
int semCount = 0;
LOG_MSG("MQ POP successful:"<< m_mqName<<", qSize="<<qSize);
//ReleaseMutex(m_mqMutex);
if(0 == ReleaseMutex(m_mqMutex))
{
LOG_MSG("Release mutex error:"<<GetLastError());
}
if ( 0 == ReleaseSemaphore(
m_mqEmptySemaphore, // handle to semaphore
1, // increase count by one
NULL)) // not interested in previous count
{
//LOG_MSG("semCount = "<<semCount);
LOG_MSG("Release Empty Semaphore error: "<<GetLastError());
}
else
{
//LOG_MSG("semCount = "<<semCount);
LOG_MSG("empty Semaphore signalled successfully");
}
return (int)qSize;
}
break;
}
case WAIT_TIMEOUT:
{
LOG_MSG("ERROR: Failed to aquire Mutex:"<<m_mqName<<", due to Timeout:"<<timeoutVal);
execResult = MQState_QOpTimeOut_Error;
break;
}
default: // The thread got ownership of an abandoned mutex
{
LOG_MSG("ERROR: Failed to aquire Mutex:"<<m_mqName<<", due to GetLastError:"<<GetLastError());
execResult = MQState_Queue_Unhandled_Error;
}
} // end of switch (dwMutexWaitResult)
}
else if(dwFullSemaphoreWaitResult == WAIT_TIMEOUT)
{
LOG_MSG("ERROR: Failed to aquire m_mqFullSemaphore:"<<m_mqName<<", due to Timeout:"<<timeoutVal);
execResult = MQState_QOpTimeOut_Error;
}
else
{
LOG_MSG("ERROR: Failed to aquire m_mqFullSemaphore:"<<m_mqName<<", GetLastError:"<<GetLastError());
execResult = MQState_Queue_Unhandled_Error;
}
if(execResult != MQState_QOp_Success)
return execResult;
//=================================================================================================
//LOG_FUNC_END;
return execResult;
Producer thread
MSG_QUEUE_STATUS execResult = MQState_QOp_Success;
//Wait for empty semaphore
DWORD dwSemaphoreWaitResult = WaitForSingleObject( m_mqEmptySemaphore, // handle to semaphore
timeoutValInMs); // time-out interval
LOG_MSG("m_mqEmptySemaphore: "<<m_mqEmptySemaphore);
LOG_MSG("Got the m_mqEmptySemaphore");
//Wait for mutex
if(dwSemaphoreWaitResult == WAIT_OBJECT_0)
{
DWORD dwMutexWaitResult = WaitForSingleObject( m_mqMutex, // handle to mutex
timeoutValInMs); // time-out interval
//Queue_Mutex_Handler mutexHandler(m_mqMutex);
LOG_MSG("Got the m_mqMutex");
switch(dwMutexWaitResult)
{
case WAIT_OBJECT_0:
LOG_MSG("going to send Message");
if(m_pMsgQueueImpl->Enqueue(srcMsg) )
{
LOG_MSG("Message Sent successfully");
//int semCount;
if(0 == ReleaseMutex(m_mqMutex))
{
LOG_MSG("Release mutex error:"<<GetLastError());
}
if ( 0 == ReleaseSemaphore(
m_mqFullSemaphore, // handle to semaphore
1, // increase count by one
NULL)) // not interested in previous count
{
//LOG_MSG("semCount = "<<semCount);
LOG_MSG("Release full Semaphore error: "<<GetLastError());
}
else
{
//LOG_MSG("semCount = "<<semCount);
LOG_MSG("full Semaphore signalled successfully");
}
///++++++++++++++
}
else
{
LOG_MSG("ERROR: Enqueue failed, MSG Queue is Full, QName = "<< m_mqName);
if(0 == ReleaseMutex(m_mqMutex))
{
LOG_MSG("Release mutex error:"<<GetLastError());
}
execResult = MQState_Queue_Full_Error;
}
break;
case WAIT_TIMEOUT:
LOG_MSG("ERROR: Failed to aquire MsgQueue Mutex:"<<m_mqName<<", due to Timeout:"<<timeoutValInMs);
execResult = MQState_QOpTimeOut_Error;
break;
default:
LOG_MSG("ERROR: Failed to aquire MsgQueue Mutex:"<<m_mqName);
execResult = MQState_Queue_Unhandled_Error;
}//switch ends
}
else if(WAIT_TIMEOUT==dwSemaphoreWaitResult)
{
LOG_MSG("ERROR: Failed to aquire MsgQueue semaphore:"<<m_mqName<<", due to Timeout:"<<timeoutValInMs);
execResult = MQState_QOpTimeOut_Error;
}
else
{
LOG_MSG("ERROR: Failed to aquire MsgQueue semaphore:"<<m_mqName);
execResult = MQState_Queue_Unhandled_Error;
}
//RAII
//LOG_FUNC_END;
return execResult;
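For reference, Win32 reports error 298 (ERROR_TOO_MANY_POSTS) when a ReleaseSemaphore call would push the count above the maximum given at creation; in that case the count is left unchanged. The following is only a portable C++ model of that counting rule, not the Win32 implementation; BoundedSemaphore and its members are made-up names.

```cpp
#include <condition_variable>
#include <mutex>

// A toy counting semaphore with a maximum. It models only the counting rule.
class BoundedSemaphore {
public:
    BoundedSemaphore(long initial, long maximum)
        : count_(initial), max_(maximum) {}

    // Like a successful wait: blocks until the count is positive, then decrements.
    void acquire() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }

    // Like a release of n: fails, leaving the count unchanged, if the
    // release would exceed the maximum ("too many posts").
    bool release(long n = 1) {
        std::lock_guard<std::mutex> lock(m_);
        if (n <= 0 || count_ + n > max_) return false;
        count_ += n;
        cv_.notify_all();
        return true;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    long count_;
    const long max_;
};
```

With an Empty semaphore whose count starts equal to its maximum, as in the question, any release that is not paired with an earlier successful wait on that same semaphore object fails in exactly this way. Whether that is what happens in the code above cannot be determined from the excerpt alone.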

Wait notify pthreads unix C++

I have n threads, each modifying an object O(k), where k ranges from 0 to n-1.
There is also a listener thread l that needs to be alerted when any thread k has modified its object O(k).
What is the fastest way to implement this?
Use a POSIX (or, even better, standard C++) condition variable, as one commenter already suggested. You can use the associated mutex to protect a std::array of flags, one flag per worker thread. When a worker thread modifies its object, it acquires the mutex, raises its flag, and signals the condition variable. When the listener thread is notified, it services the k-th object (corresponding to the k-th flag in the array), lowers the flag, and then releases the mutex.
Be sure to read examples for condition variables so you understand when the mutex is automatically acquired and released.
In general, the standard C++ threading primitives are easier to use, since they use RAII for automatic unlocking of mutexes and so on. They are also portable to non-POSIX environments. But here is a pthreads example from
http://www.yolinux.com/TUTORIALS/LinuxTutorialPosixThreads.html
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
pthread_mutex_t count_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t condition_var = PTHREAD_COND_INITIALIZER;
void *functionCount1(void *arg);
void *functionCount2(void *arg);
int count = 0;
#define COUNT_DONE 10
#define COUNT_HALT1 3
#define COUNT_HALT2 6
int main(void)
{
pthread_t thread1, thread2;
pthread_create( &thread1, NULL, &functionCount1, NULL);
pthread_create( &thread2, NULL, &functionCount2, NULL);
pthread_join( thread1, NULL);
pthread_join( thread2, NULL);
printf("Final count: %d\n",count);
exit(EXIT_SUCCESS);
}
// Write numbers 1-3 and 8-10 as permitted by functionCount2()
void *functionCount1(void *arg)
{
for(;;)
{
// Lock the mutex, then wait for the signal
pthread_mutex_lock( &count_mutex );
// Wait while functionCount2() operates on count.
// The mutex is released during the wait and re-acquired after condition_var is signaled.
pthread_cond_wait( &condition_var, &count_mutex );
count++;
printf("Counter value functionCount1: %d\n",count);
pthread_mutex_unlock( &count_mutex );
if(count >= COUNT_DONE) return(NULL);
}
}
// Write numbers 4-7
void *functionCount2(void *arg)
{
for(;;)
{
pthread_mutex_lock( &count_mutex );
if( count < COUNT_HALT1 || count > COUNT_HALT2 )
{
// Condition of if statement has been met.
// Signal the waiting thread; it proceeds once this thread unlocks the mutex.
// Note: functionCount1() is now permitted to modify "count".
pthread_cond_signal( &condition_var );
}
else
{
count++;
printf("Counter value functionCount2: %d\n",count);
}
pthread_mutex_unlock( &count_mutex );
if(count >= COUNT_DONE) return(NULL);
}
}
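The flag-array scheme described in this answer could look roughly like this with the standard C++ primitives. It is a sketch under assumptions: kWorkers, Notifier, mark_modified, wait_for_modified and run_demo are illustrative names, and the listener here returns the indices instead of servicing the objects while holding the mutex.

```cpp
#include <array>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

constexpr std::size_t kWorkers = 4;   // hypothetical n

struct Notifier {
    std::mutex m;
    std::condition_variable cv;
    std::array<bool, kWorkers> dirty{};   // one flag per worker, guarded by m
};

// Called by worker k after it has modified O(k).
void mark_modified(Notifier &n, std::size_t k) {
    {
        std::lock_guard<std::mutex> lock(n.m);
        n.dirty[k] = true;
    }
    n.cv.notify_one();
}

// Listener: blocks until at least one flag is raised, then returns the
// indices of all raised flags and lowers them.
std::vector<std::size_t> wait_for_modified(Notifier &n) {
    std::unique_lock<std::mutex> lock(n.m);
    n.cv.wait(lock, [&n] {
        for (bool d : n.dirty) if (d) return true;
        return false;
    });
    std::vector<std::size_t> ks;
    for (std::size_t k = 0; k < kWorkers; ++k) {
        if (n.dirty[k]) { ks.push_back(k); n.dirty[k] = false; }
    }
    return ks;
}

// Demo: each worker reports exactly once; returns how many notifications
// the listener collected in total.
std::size_t run_demo() {
    Notifier n;
    std::vector<std::thread> workers;
    for (std::size_t k = 0; k < kWorkers; ++k)
        workers.emplace_back([&n, k] { mark_modified(n, k); });
    std::size_t seen = 0;
    while (seen < kWorkers) seen += wait_for_modified(n).size();
    for (auto &t : workers) t.join();
    return seen;
}
```

The predicate passed to wait() is what protects against lost or spurious wakeups: the listener re-checks the flags under the mutex instead of trusting the notification itself.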

Readers Writers - Writer thread always stuck with multiple reader thread

Newbie here.
I have been working on a solution to the readers/writers problem.
It works perfectly fine with 1 reader and 1 writer.
But when I change the number of readers to 2, the writer thread always starves. Help me!
It seems the writer thread is stuck somewhere waiting for the wrt mutex.
#include <stdio.h>
#include <conio.h>
#include <windows.h>
HANDLE mutex, wrt;
int g_ReadCount = 0;
int g_GlobalData=0;
const int max = 2;
HANDLE reader[max], writer[max];
CRITICAL_SECTION rSect, wSect;
bool bTerminate = true;
DWORD Readers(LPVOID lpdwThreadParam )
{
while(bTerminate)
{
WaitForSingleObject(mutex, INFINITE);
g_ReadCount++;
if(g_ReadCount == 1)
{
WaitForSingleObject(wrt, INFINITE);
}
ReleaseMutex(mutex);
EnterCriticalSection(&wSect);
printf("ThreadId : %d --> Read data : %d ReaderCount %d\n", GetCurrentThreadId(), g_GlobalData, g_ReadCount);
LeaveCriticalSection(&wSect);
WaitForSingleObject(mutex, INFINITE);
g_ReadCount--;
if(g_ReadCount == 0)
{
ReleaseMutex(wrt);
printf("ThreadId : %d Realesed Mutex wrt\n", GetCurrentThreadId());
}
printf("ThreadId : %d ReaderCount %d\n", GetCurrentThreadId(), g_ReadCount);
ReleaseMutex(mutex);
printf("Reader ThreadId : %d Realesed Mutex mutex\n", g_ReadCount);
Sleep(0);
}
return 0;
}
DWORD Writers(LPVOID lpdwThreadParam )
{
int n = GetCurrentThreadId();
int temp = 1;
while(bTerminate)
{
printf("ThreadId : %d Waiting for WRT\n", GetCurrentThreadId());
WaitForSingleObject(wrt, INFINITE);
printf("WRITER ThreadId : %d ***Got WRT\n", GetCurrentThreadId());
++n;
temp++;
if(temp == 100)
{
//bTerminate = false;
}
EnterCriticalSection(&wSect);
printf("Write by ThreadId : %d Data : %d Temp %d\n", GetCurrentThreadId(), n, temp);
g_GlobalData = n;
LeaveCriticalSection(&wSect);
ReleaseMutex(wrt);
}
printf("***VVV***Exiting Writer Thread\n");
return 0;
}
void main()
{
mutex = CreateMutex(NULL, false, "Writer");
wrt = CreateMutex(NULL, false, "wrt");
InitializeCriticalSection(&rSect);
InitializeCriticalSection(&wSect);
DWORD dwThreadId = 0;
for(int i=0; i < max; i++)
{
reader[i] = CreateThread(NULL, //Choose default security
0, //Default stack size
(LPTHREAD_START_ROUTINE)&Readers,
//Routine to execute
(LPVOID) 0, //Thread parameter
0, //Immediately run the thread
&dwThreadId //Thread Id
);
}
for(int i=0; i < 1; i++)
{
writer[i] = CreateThread(NULL, //Choose default security
0, //Default stack size
(LPTHREAD_START_ROUTINE)&Writers,
//Routine to execute
(LPVOID) 0, //Thread parameter
0, //Immediately run the thread
&dwThreadId //Thread Id
);
}
getchar();
}
With more than 1 reader thread, it is quite likely that g_ReadCount will never get to zero, so the wrt mutex will never be released (thus starving the writer). You probably need some kind of indicator that the writer thread is waiting. Then the reader threads would need to give precedence to the writer at some point.
For example, in one implementation I wrote (not saying it is a great way, but it worked) I used a flag that was set/cleared via atomic increment/decrement operations that indicated if a writer thread was waiting for the lock. If so, the readers would hold off. Of course, in that case you also need to then be careful of the opposite situation where writer threads (if more than one) could starve readers. Read/Write locks are tricky.
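As an illustration of the "readers hold off while a writer is waiting" idea, here is a minimal sketch. It swaps in a counter guarded by a std::mutex and a condition variable for the atomic flag described above, and WriterPreferringLock is a made-up name, so treat it as one possible shape rather than the implementation that answer refers to.

```cpp
#include <condition_variable>
#include <mutex>

class WriterPreferringLock {
public:
    void lock_shared() {
        std::unique_lock<std::mutex> lk(m_);
        // Readers hold off while a writer is active OR merely waiting.
        cv_.wait(lk, [this] { return !writer_active_ && writers_waiting_ == 0; });
        ++readers_;
    }
    void unlock_shared() {
        std::lock_guard<std::mutex> lk(m_);
        if (--readers_ == 0) cv_.notify_all();   // last reader lets a writer in
    }
    void lock() {
        std::unique_lock<std::mutex> lk(m_);
        ++writers_waiting_;                      // announce intent to write
        cv_.wait(lk, [this] { return !writer_active_ && readers_ == 0; });
        --writers_waiting_;
        writer_active_ = true;
    }
    void unlock() {
        std::lock_guard<std::mutex> lk(m_);
        writer_active_ = false;
        cv_.notify_all();
    }
    // Non-blocking variant of lock_shared(), handy for experiments.
    bool try_lock_shared() {
        std::lock_guard<std::mutex> lk(m_);
        if (writer_active_ || writers_waiting_ != 0) return false;
        ++readers_;
        return true;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    int readers_ = 0;
    int writers_waiting_ = 0;
    bool writer_active_ = false;
};
```

As the answer warns, this simply moves the starvation risk: a steady stream of writers can now keep readers out.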
While working on this problem, I found an interesting issue.
During our studies we were told that a semaphore with a maximum count of 1 is equivalent to a mutex. That is not entirely true.
1) A mutex cannot be released by any thread other than the one that owns it.
2) A semaphore can be released by any thread, so it can be used in this situation.
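The ownership difference can be demonstrated in a few lines. Note the assumptions: this sketch uses POSIX primitives on Linux (an error-checking pthread mutex and an unnamed sem_t) rather than the Win32 objects from the question, and Shared, other_thread and compare_ownership are illustrative names. The Win32 objects follow the same rule, in that ReleaseMutex fails for a thread that does not own the mutex.

```cpp
#include <cerrno>
#include <pthread.h>
#include <semaphore.h>

struct Shared {
    pthread_mutex_t mtx;
    sem_t sem;
    int mutex_unlock_result = -1;
    int sem_post_result = -1;
};

static void *other_thread(void *arg) {
    Shared *s = static_cast<Shared *>(arg);
    // Not the owner: an error-checking mutex refuses with EPERM.
    s->mutex_unlock_result = pthread_mutex_unlock(&s->mtx);
    // A semaphore has no owner, so any thread may post it.
    s->sem_post_result = sem_post(&s->sem);
    return nullptr;
}

// Stores what the non-owning thread observed; returns 0 on success.
int compare_ownership(int *mutex_result, int *sem_result) {
    Shared s;
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&s.mtx, &attr);
    pthread_mutexattr_destroy(&attr);
    sem_init(&s.sem, 0, 0);          // count 0, i.e. currently "taken"

    pthread_mutex_lock(&s.mtx);      // the calling thread owns the mutex

    pthread_t tid;
    if (pthread_create(&tid, nullptr, other_thread, &s) != 0) return -1;
    pthread_join(tid, nullptr);

    *mutex_result = s.mutex_unlock_result;
    *sem_result = s.sem_post_result;

    pthread_mutex_unlock(&s.mtx);    // only the owner can really unlock
    pthread_mutex_destroy(&s.mtx);
    sem_destroy(&s.sem);
    return 0;
}
```

This is why the readers' pattern in the question, where the first reader takes wrt and the last reader releases it, needs a semaphore rather than a mutex whenever those can be different threads.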

Waitpid equivalent with timeout?

Imagine I have a process that starts several child processes. The parent needs to know when a child exits.
I can use waitpid, but then if/when the parent needs to exit I have no way of telling the thread that is blocked in waitpid to exit gracefully and join it. It's nice to have things clean up themselves, but it may not be that big of a deal.
I can use waitpid with WNOHANG, and then sleep for some arbitrary time to prevent a busy wait. However then I can only know if a child has exited every so often. In my case it may not be super critical that I know when a child exits right away, but I'd like to know ASAP...
I can use a signal handler for SIGCHLD, and in the signal handler do whatever I was going to do when a child exits, or send a message to a different thread to do some action. But using a signal handler obfuscates the flow of the code a little bit.
What I'd really like to do is use waitpid on some timeout, say 5 sec. Since exiting the process isn't a time critical operation, I can lazily signal the thread to exit, while still having it blocked in waitpid the rest of the time, always ready to react. Is there such a call in linux? Of the alternatives, which one is best?
EDIT:
Another method based on the replies would be to block SIGCHLD in all threads with pthread_sigmask(). Then in one thread, keep calling sigtimedwait() while looking for SIGCHLD. This means that I can time out on that call and check whether the thread should exit, and if not, remain blocked waiting for the signal. Once a SIGCHLD is delivered to this thread, we can react to it immediately, and in line of the wait thread, without using a signal handler.
Don't mix alarm() with wait(). You can lose error information that way.
Use the self-pipe trick. This turns any signal into a select()able event:
int selfpipe[2];
void selfpipe_sigh(int n)
{
int save_errno = errno;
(void)write(selfpipe[1], "",1);
errno = save_errno;
}
void selfpipe_setup(void)
{
static struct sigaction act;
if (pipe(selfpipe) == -1) { abort(); }
fcntl(selfpipe[0],F_SETFL,fcntl(selfpipe[0],F_GETFL)|O_NONBLOCK);
fcntl(selfpipe[1],F_SETFL,fcntl(selfpipe[1],F_GETFL)|O_NONBLOCK);
memset(&act, 0, sizeof(act));
act.sa_handler = selfpipe_sigh;
sigaction(SIGCHLD, &act, NULL);
}
Then, your waitpid-like function looks like this:
int selfpipe_waitpid(void)
{
static char dummy[4096];
fd_set rfds;
struct timeval tv;
int died = 0, st;
tv.tv_sec = 5;
tv.tv_usec = 0;
FD_ZERO(&rfds);
FD_SET(selfpipe[0], &rfds);
if (select(selfpipe[0]+1, &rfds, NULL, NULL, &tv) > 0) {
while (read(selfpipe[0],dummy,sizeof(dummy)) > 0);
while (waitpid(-1, &st, WNOHANG) > 0) died++; /* stops at 0 (children still running) or -1 (no children left) */
}
return died;
}
You can see in selfpipe_waitpid() how you can control the timeout and even mix with other select()-based IO.
Fork an intermediate child, which forks the real child and a timeout process and waits for all (both) of its children. When one exits, it'll kill the other one and exit.
pid_t intermediate_pid = fork();
if (intermediate_pid == 0) {
pid_t worker_pid = fork();
if (worker_pid == 0) {
do_work();
_exit(0);
}
pid_t timeout_pid = fork();
if (timeout_pid == 0) {
sleep(timeout_time);
_exit(0);
}
pid_t exited_pid = wait(NULL);
if (exited_pid == worker_pid) {
kill(timeout_pid, SIGKILL);
} else {
kill(worker_pid, SIGKILL); // Or something less violent if you prefer
}
wait(NULL); // Collect the other process
_exit(0); // Or some more informative status
}
waitpid(intermediate_pid, 0, 0);
Surprisingly simple :)
You can even leave out the intermediate child if you're sure no other module in the program is spawning child processes of its own.
This is an interesting question.
I found that sigtimedwait can do it.
EDIT 2016/08/29:
Thanks to Mark Edington for the suggestion. I've tested his example on Ubuntu 16.04 and it works as expected.
Note: this only works for child processes. Unfortunately, Linux/Unix seems to have no equivalent of Windows' WaitForSingleObject(unrelated_process_handle, timeout) for being notified of an unrelated process's termination within a timeout.
OK, Mark Edington's sample code is here:
/* The program creates a child process and waits for it to finish. If a timeout
* elapses the child is killed. Waiting is done using sigtimedwait(). Race
* condition is avoided by blocking the SIGCHLD signal before fork().
*/
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
static pid_t fork_child (void)
{
int p = fork ();
if (p == -1) {
perror ("fork");
exit (1);
}
if (p == 0) {
puts ("child: sleeping...");
sleep (10);
puts ("child: exiting");
exit (0);
}
return p;
}
int main (int argc, char *argv[])
{
sigset_t mask;
sigset_t orig_mask;
struct timespec timeout;
pid_t pid;
sigemptyset (&mask);
sigaddset (&mask, SIGCHLD);
if (sigprocmask(SIG_BLOCK, &mask, &orig_mask) < 0) {
perror ("sigprocmask");
return 1;
}
pid = fork_child ();
timeout.tv_sec = 5;
timeout.tv_nsec = 0;
do {
if (sigtimedwait(&mask, NULL, &timeout) < 0) {
if (errno == EINTR) {
/* Interrupted by a signal other than SIGCHLD. */
continue;
}
else if (errno == EAGAIN) {
printf ("Timeout, killing child\n");
kill (pid, SIGKILL);
}
else {
perror ("sigtimedwait");
return 1;
}
}
break;
} while (1);
if (waitpid(pid, NULL, 0) < 0) {
perror ("waitpid");
return 1;
}
return 0;
}
If your program runs only on contemporary Linux kernels (5.3 or later), the preferred way is to use pidfd_open (https://lwn.net/Articles/789023/ https://man7.org/linux/man-pages/man2/pidfd_open.2.html).
This system call returns a file descriptor representing a process, and then you can select, poll or epoll it, the same way you wait on other types of file descriptors.
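For example (a fragment; error handling omitted),
int fd = pidfd_open(pid, 0);
struct pollfd pfd = {fd, POLLIN, 0};
int exited = (poll(&pfd, 1, 1000) == 1); /* true if the process exited within 1000 ms */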
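A slightly fuller version of that fragment might look like the sketch below. The assumptions are: the kernel headers define SYS_pidfd_open, and the raw syscall() is used because older glibc releases ship no pidfd_open() wrapper. waitpid_timeout_pidfd is a made-up name.

```cpp
#include <cerrno>
#include <poll.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Waits up to timeout_ms for child 'pid' to exit.
// Returns 1 if it exited (and was reaped into *status), 0 on timeout,
// -1 on error with errno set.
int waitpid_timeout_pidfd(pid_t pid, int timeout_ms, int *status) {
    // Raw syscall; assumes SYS_pidfd_open is available from the kernel headers.
    int fd = static_cast<int>(syscall(SYS_pidfd_open, pid, 0));
    if (fd == -1) return -1;              // e.g. ENOSYS on kernels before 5.3

    struct pollfd pfd = {fd, POLLIN, 0};
    int rc;
    do {
        // The pidfd becomes readable once the process has terminated.
        // Note: a retry after EINTR restarts the full timeout.
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc == -1 && errno == EINTR);
    int saved = errno;
    close(fd);
    errno = saved;

    if (rc <= 0) return rc;               // 0: timeout, -1: poll error
    return waitpid(pid, status, 0) == pid ? 1 : -1;
}
```

On timeout the child is left running and unreaped, so the caller can decide whether to wait again, signal it, or give up.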
waitpid() can be interrupted by a signal, so you could set a timer before calling it, and it will fail with EINTR when the timer signal is raised. Edit: It should be as simple as calling alarm(5) before calling waitpid() (with a SIGALRM handler installed, since the default action for SIGALRM terminates the process).
I thought that select would return EINTR when a SIGCHLD is raised by one of the children.
I believe this should work:
while(1)
{
int retval = select(0, NULL, NULL, NULL, &tv); // sleeps until the timeout or a signal
if (retval == -1 && errno == EINTR) // some signal (needs a handler installed for SIGCHLD)
{
pid_t pid = waitpid(-1, &st, WNOHANG);
if (pid > 0) // some child exited
{
// handle the exited child
}
}
else if (retval == 0)
{
// timeout
break;
}
else // error
{
break;
}
}
Note: you can use pselect to override current sigmask and avoid interrupts from unneeded signals.
Instead of calling waitpid() directly, you could call sigtimedwait() on SIGCHLD (which is sent to the parent process after a child exits) and wait for it to be delivered to the current thread. As the function name suggests, it supports a timeout parameter.
Please check the following code snippet for details:
static bool waitpid_with_timeout(pid_t pid, int timeout_ms, int* status) {
sigset_t child_mask, old_mask;
sigemptyset(&child_mask);
sigaddset(&child_mask, SIGCHLD);
if (sigprocmask(SIG_BLOCK, &child_mask, &old_mask) == -1) {
printf("*** sigprocmask failed: %s\n", strerror(errno));
return false;
}
timespec ts;
ts.tv_sec = MSEC_TO_SEC(timeout_ms);
ts.tv_nsec = (timeout_ms % 1000) * 1000000;
int ret = TEMP_FAILURE_RETRY(sigtimedwait(&child_mask, NULL, &ts));
int saved_errno = errno;
// Set the signals back the way they were.
if (sigprocmask(SIG_SETMASK, &old_mask, NULL) == -1) {
printf("*** sigprocmask failed: %s\n", strerror(errno));
if (ret == 0) {
return false;
}
}
if (ret == -1) {
errno = saved_errno;
if (errno == EAGAIN) {
errno = ETIMEDOUT;
} else {
printf("*** sigtimedwait failed: %s\n", strerror(errno));
}
return false;
}
pid_t child_pid = waitpid(pid, status, WNOHANG);
if (child_pid != pid) {
if (child_pid != -1) {
printf("*** Waiting for pid %d, got pid %d instead\n", pid, child_pid);
} else {
printf("*** waitpid failed: %s\n", strerror(errno));
}
return false;
}
return true;
}
Refer: https://android.googlesource.com/platform/frameworks/native/+/master/cmds/dumpstate/DumpstateUtil.cpp#46
If you're going to use signals anyways (as per Steve's suggestion), you can just send the signal manually when you want to exit. This will cause waitpid to return EINTR and the thread can then exit. No need for a periodic alarm/restart.
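A minimal sketch of that idea follows, assuming a POSIX system. The names on_wakeup, WaitState, waiter and interrupt_blocked_waitpid are illustrative, and SIGUSR1 is an arbitrary choice of wake-up signal. The handler is installed without SA_RESTART, otherwise waitpid() would be restarted instead of failing with EINTR.

```cpp
#include <atomic>
#include <cerrno>
#include <pthread.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static void on_wakeup(int) { /* empty: its only job is to interrupt waitpid() */ }

struct WaitState {
    pid_t child = -1;
    int err = 0;                        // errno seen by the waiter, 0 if none
    std::atomic<bool> done{false};
};

static void *waiter(void *arg) {
    WaitState *s = static_cast<WaitState *>(arg);
    // Blocks until the child exits or a handled signal interrupts the call.
    s->err = (waitpid(s->child, nullptr, 0) == -1) ? errno : 0;
    s->done.store(true);
    return nullptr;
}

// Returns the errno the waiter thread saw (EINTR expected), or -1 on setup error.
int interrupt_blocked_waitpid() {
    struct sigaction sa = {};
    sa.sa_handler = on_wakeup;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                    // no SA_RESTART
    if (sigaction(SIGUSR1, &sa, nullptr) == -1) return -1;

    pid_t child = fork();
    if (child == -1) return -1;
    if (child == 0) { pause(); _exit(0); }   // demo child that never exits by itself

    WaitState st;
    st.child = child;
    pthread_t tid;
    if (pthread_create(&tid, nullptr, waiter, &st) != 0) return -1;

    // Keep nudging until the waiter reports back; a single signal could
    // arrive before the thread has entered waitpid() and be wasted.
    while (!st.done.load()) {
        pthread_kill(tid, SIGUSR1);
        usleep(10 * 1000);
    }
    pthread_join(tid, nullptr);

    kill(child, SIGKILL);               // clean up the demo child
    waitpid(child, nullptr, 0);
    return st.err;
}
```

The repeated pthread_kill() is there only to close the startup race in this demo; in a real program the exit request would normally also set a flag that the waiting thread checks after EINTR.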
Due to circumstances I absolutely needed this to run in the main thread and it was not very simple to use the self-pipe trick or eventfd because my epoll loop was running in another thread. So I came up with this by scrounging together other stack overflow handlers. Note that in general it's much safer to do this in other ways but this is simple. If anyone cares to comment about how it's really really bad then I'm all ears.
NOTE: It is absolutely necessary to block signal handling in every thread except the one you want to run this in. I do this by default, as I believe it is messy to handle signals in random threads.
static void ctlWaitPidTimeout(pid_t child, useconds_t usec, int *timedOut) {
int rc = -1;
static pthread_mutex_t alarmMutex = PTHREAD_MUTEX_INITIALIZER;
TRACE("ctlWaitPidTimeout: waiting on %lu\n", (unsigned long) child);
/**
* paranoid, in case this was called twice in a row by different
* threads, which could quickly turn very messy.
*/
pthread_mutex_lock(&alarmMutex);
/* set the alarm handler */
struct sigaction alarmSigaction;
struct sigaction oldSigaction;
sigemptyset(&alarmSigaction.sa_mask);
alarmSigaction.sa_flags = 0;
alarmSigaction.sa_handler = ctlAlarmSignalHandler;
sigaction(SIGALRM, &alarmSigaction, &oldSigaction);
/* set alarm, because no alarm is fired when the first argument is 0, 1 is used instead */
ualarm((usec == 0) ? 1 : usec, 0);
/* wait for the child we just killed */
rc = waitpid(child, NULL, 0);
/* if errno == EINTR, the alarm went off, set timedOut to true */
*timedOut = (rc == -1 && errno == EINTR);
/* in case we did not time out, unset the current alarm so it doesn't bother us later */
ualarm(0, 0);
/* restore old signal action */
sigaction(SIGALRM, &oldSigaction, NULL);
pthread_mutex_unlock(&alarmMutex);
TRACE("ctlWaitPidTimeout: timeout wait done, rc = %d, error = '%s'\n", rc, (rc == -1) ? strerror(errno) : "none");
}
static void ctlAlarmSignalHandler(int s) {
TRACE("ctlAlarmSignalHandler: alarm occured, %d\n", s);
}
EDIT: I've since transitioned to using a solution that integrates well with my existing epoll()-based eventloop, using timerfd. I don't really lose any platform-independence since I was using epoll anyway, and I gain extra sleep because I know the unholy combination of multi-threading and UNIX signals won't hurt my program again.
Quoting the question: "I can use a signal handler for SIGCHLD, and in the signal handler do whatever I was going to do when a child exits, or send a message to a different thread to do some action. But using a signal handler obfuscates the flow of the code a little bit."
In order to avoid race conditions you should avoid doing anything more complex than changing a volatile flag in a signal handler.
I think the best option in your case is to send a signal to the parent. waitpid() will then set errno to EINTR and return. At this point you check for waitpid return value and errno, notice you have been sent a signal and take appropriate action.
If a third party library is acceptable then the libkqueue project emulates kqueue (the *BSD eventing system) and provides basic process monitoring with EVFILT_PROC + NOTE_EXIT.
The main advantages of using kqueue or libkqueue are that it's cross-platform and doesn't have the complexity of signal handling. If your program uses async I/O, you may also find it a lower-friction interface than something like epoll and the various *fd functions (signalfd, eventfd, pidfd, etc.).
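#include <errno.h> /* for errno */
#include <stdio.h>
#include <stdint.h>
#include <string.h> /* for strerror */
#include <unistd.h> /* for close */
#include <sys/event.h> /* kqueue header */
#include <sys/types.h> /* for pid_t */
/* Link with -lkqueue */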
int waitpid_timeout(pid_t pid, struct timespec *timeout)
{
struct kevent changelist, eventlist;
int kq, ret;
/* Populate a changelist entry (an event we want to be notified of) */
EV_SET(&changelist, pid, EVFILT_PROC, EV_ADD, NOTE_EXIT, 0, NULL);
kq = kqueue();
/* Call kevent with a timeout */
ret = kevent(kq, &changelist, 1, &eventlist, 1, timeout);
/* Kevent returns 0 on timeout, the number of events that occurred, or -1 on error */
switch (ret) {
case -1:
printf("Error %s\n", strerror(errno));
break;
case 0:
printf("Timeout\n");
break;
case 1:
printf("PID %u exited, status %u\n", (unsigned int)eventlist.ident, (unsigned int)eventlist.data);
break;
}
close(kq);
return ret;
}
Behind the scenes on Linux libkqueue uses either pidfd on Linux kernels >= 5.3 or a waiter thread that listens for SIGCHLD and notifies one or more kqueue instances when a process exits. The second approach is not efficient (it scans PIDs that interest has been registered for using waitid), but that doesn't matter unless you're waiting on large numbers of PIDs.
EVFILT_PROC support has been included in kqueue since its inception, and in libkqueue since v2.5.0.