The normal implementations of a work queue I have seen involve mutexes and condition variables.
Consumer:
A) Acquires Lock
B) While Queue empty
Wait on Condition Variable (thus suspending thread and releasing lock)
C) Work object retrieved from queue
D) Lock is released
E) Do Work
F) GOTO A
Producer:
A) Acquires Lock
B) Work is added to queue
C) condition variable is signaled (potentially releasing worker)
D) Lock is released
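A minimal C++ sketch of that pattern, assuming a std::function job type (the WorkQueue/WorkItem names are just for illustration, not from any original code):

#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>

// Illustrative job type; the real code's work object is not shown in the question.
using WorkItem = std::function<void()>;

class WorkQueue {
public:
    // Producer: A) lock, B) enqueue, C) signal, D) unlock (lock released at scope exit).
    void push(WorkItem job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(job));
        }
        cv_.notify_one();            // potentially releases one waiting consumer
    }

    // Consumer steps A-D: lock, wait while empty (the wait releases the lock
    // while sleeping), retrieve the job, unlock on return.
    WorkItem pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !queue_.empty(); });
        WorkItem job = std::move(queue_.front());
        queue_.pop();
        return job;                  // E) caller does the work, F) calls pop() again
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<WorkItem> queue_;
};

Note that notify_one is called after the lock is released, so the woken consumer does not immediately block again on a still-held mutex.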
I have been browsing some code and I saw an implementation using POSIX pipes (I have not seen this technique before).
Consumer:
A) Do select on pipe (thus suspending thread while no work)
B) Get Job from pipe
C) Do Work
D) GOTO A
Producer:
A) Write Job to pipe.
Since the producer and consumer are threads inside the same application, they share the same address space, so pointers passed between them remain valid; the jobs are written to the pipe as the address of the work object (a C++ object). So all that has to be written to or read from the pipe is an 8-byte address.
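A sketch of what that pipe-based queue might look like, assuming pipe(pipefd) has been called during setup (the Job type and function names are illustrative, not from the code I was browsing):

#include <sys/select.h>
#include <unistd.h>

struct Job { void run() { /* do the work */ } };  // stand-in for the real work object

int pipefd[2];                       // pipefd[0] = read end, pipefd[1] = write end

// Producer: write the address of the job into the pipe.
// Writes of <= PIPE_BUF bytes are atomic, so a pointer never gets torn.
void produce(Job* job) {
    write(pipefd[1], &job, sizeof(job));
}

// Consumer: select() suspends the thread until the pipe is readable,
// then the pointer is read back and the work is done.
void consume_loop() {
    for (;;) {
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(pipefd[0], &readfds);
        if (select(pipefd[0] + 1, &readfds, nullptr, nullptr, nullptr) <= 0)
            continue;                // interrupted or error; retry

        Job* job = nullptr;
        if (read(pipefd[0], &job, sizeof(job)) == (ssize_t)sizeof(job))
            job->run();
    }
}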
My question is:
Is this a common technique (have I been sheltered from this) and what are the advantages/disadvantages?
My curiosity was piqued because the pipe technique does not involve any visible lock or signals (any locking may be hidden inside select). So I was wondering whether this would be more efficient.
Edit:
Based on comments on Maxim Yegorushkin's answer:
Actually the "Producer" in this scenario is involved in a lot of high-volume I/O from many sources in parallel. So I suspect the original author thought it very desirable that this thread never block under any circumstances, but also did not want to do high-cost work in the "Producer" thread.
As it's been mentioned here already, people use pipes as queues to avoid blocking on a condition variable in a non-blocking I/O thread (i.e. the thread that handles multiple sockets and blocks on select/epoll). If an I/O thread blocks on a condition variable or a mutex it can't do non-blocking I/O any more.
Some say that writing into a pipe involves a system call and may increase latency when the volume of inter-thread events is high. That is only true for naive pipe-based queue implementations.
Advanced implementations use lock-free linked lists of jobs/events and write to the pipe only when the first job is added to the list, in order to wake the target I/O thread from its blocking epoll call (essentially using the pipe as an edge-triggered notification mechanism, not for passing pointers to jobs/events). Because it takes a few microseconds to wake up a thread, more jobs/events may be posted to that thread's event queue during this time, but every subsequent event doesn't require writing to the pipe, until the I/O thread later wakes up and consumes all events in the queue. Also, on newer Linux kernels a faster eventfd can be used instead of a pipe to wake up an I/O thread.
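A rough sketch of that "notify only on the empty-to-non-empty transition" idea, using eventfd and, for brevity, a mutex-protected queue rather than a lock-free list (all names here are illustrative):

#include <sys/eventfd.h>
#include <unistd.h>
#include <cstdint>
#include <deque>
#include <mutex>

struct Event { /* ... */ };

std::mutex mtx;
std::deque<Event*> events;
int efd = eventfd(0, EFD_NONBLOCK);   // registered with the I/O thread's epoll set

// Producer side: only the post that makes the queue non-empty writes to the
// eventfd, so most posts avoid the wake-up syscall entirely.
void post(Event* e) {
    bool was_empty;
    {
        std::lock_guard<std::mutex> lock(mtx);
        was_empty = events.empty();
        events.push_back(e);
    }
    if (was_empty) {
        uint64_t one = 1;
        write(efd, &one, sizeof(one));
    }
}

// I/O thread, when epoll reports efd readable: clear the counter once,
// then consume every queued event in one pass.
void drain() {
    uint64_t count;
    read(efd, &count, sizeof(count));
    std::deque<Event*> batch;
    {
        std::lock_guard<std::mutex> lock(mtx);
        batch.swap(events);
    }
    for (Event* e : batch) { /* handle e */ }
}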
I have done this. It's old-school but it works.
The reason I did it this way was I needed to wake up the same thread on either a job for it to do or read input from another source, so select() was involved.
It works because of how select is structured. As you can see in the man page:
select() and pselect() allow a program to monitor multiple file descriptors, waiting until one or more of the file descriptors become "ready" for some class of I/O operation (e.g., input possible). A file descriptor is considered ready if it is possible to perform the corresponding I/O operation (e.g., read(2)) without blocking.
The key in the above is the 'waiting until one or more of the FDs become ready'. That is the synchronization point between the two threads.
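As a rough illustration (the descriptor names are made up), the thread waits on the job pipe and the other input source in a single select call, and whichever becomes readable first wakes it:

#include <sys/select.h>
#include <unistd.h>
#include <algorithm>

// job_fd: read end of the work pipe; input_fd: the other input source.
// Whichever becomes readable first wakes the thread; that is the
// synchronization point between the producer and this consumer.
void wait_loop(int job_fd, int input_fd) {
    for (;;) {
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(job_fd, &readfds);
        FD_SET(input_fd, &readfds);
        int maxfd = std::max(job_fd, input_fd);
        if (select(maxfd + 1, &readfds, nullptr, nullptr, nullptr) <= 0)
            continue;                          // interrupted; retry
        if (FD_ISSET(job_fd, &readfds)) {
            // ... read the job pointer from the pipe and run it ...
        }
        if (FD_ISSET(input_fd, &readfds)) {
            // ... read and process input from the other source ...
        }
    }
}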
I think the answer is that the pipe technique does not perform as well, since it involves system calls, which are relatively expensive. But it does mean that all the tricky locking, sleeping, and waking gets taken care of for you.
I've used both myself, but pipes only for occasional non performance critical applications.
EDIT: I suppose I might as well make the standard recommendation since nobody has come along with any clearly authoritative comments.
Standard recommendation being: Try both and benchmark them. It's the one true way to find out which performs better...
Related
I'm working on a multi-thread scheduling assignment, which involves adding threads to a variety of queues and selecting the appropriate one to execute.
The pthread_cond_signal(&condition) command is completely asynchronous from what I can tell; it's simply thrown into memory and the first thread to find it with the appropriate pthread_cond_wait() will consume it.
However, say I have a vector of thread ids that have been pushed as the thread is created, ie:
threadIDVector1[0] = 3061099328
threadIDVector1[1] = 3077884736
...
threadIDVector2[0] = 3294747394
threadIDVector2[1] = 3384567393
...
etc.
And I wanted to send a signal specifically to the thread with an id that matches the appropriate element of a vector. I.e. the algorithm would be:
While (at least one threadVector is non-empty):
Look at the first element in each vector
Select the appropriate one to signal by some criteria
Send a signal to ONLY that thread
Complete the thread and remove from threadIDVectorX
Is there some way to execute the above, or some accepted standard for achieving the same result?
There is no way to "send" a signal to a specific thread, nor to know which thread among many will be woken by the OS. It is entirely non-deterministic.
You could use the "multiple condition variable" solution as proposed in the comments. But my preferred solution to something like this is a pipe or socket pair. Have the thread doing the waking write something (like a single byte) to the pipe for the corresponding thread to signal it.
This has a lot of benefits in my book. First, it allows bidirectional communication. Your pseudocode loop at the end of your question seems to also want to remove a finished thread from the list, so you need to know when that thread is done. You could have another CV, or you could have the completing thread write a single byte back to the manager object before exiting. Much easier, I feel.
It also allows you to choose between blocking or nonblocking I/O, or to use synchronous multiplexing with select(2) or epoll(2). If you were not exiting from the worker threads, but instead wanted to reuse them, the notifying thread would need to know when they're ready for more work. Again, a CV would be fine here, but the file-descriptor approach allows the notifier to wait for all of the worker threads in a single select(2) call.
The last thing is that I find files simpler. pthreads are pretty complicated, and multithreading is already hard enough to get right; files are easier to manage and reason about in a multithreaded context, which makes it easier to avoid locking bugs and crashes.
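A sketch of the pipe-per-worker idea (the Worker struct and the one-byte protocol are my own illustrative assumptions, not a fixed API):

#include <unistd.h>

// One pipe pair per worker: the manager writes to wake_fd[1] to wake exactly
// that worker; the worker blocks in read() (or select()) on wake_fd[0].
struct Worker {
    int wake_fd[2];      // manager writes wake_fd[1], worker reads wake_fd[0]
    int done_fd[2];      // worker writes done_fd[1] when finished, manager reads done_fd[0]
};

void start_worker(Worker& w) {
    pipe(w.wake_fd);
    pipe(w.done_fd);
    // ... spawn the thread, handing it w.wake_fd[0] and w.done_fd[1] ...
}

// Manager: signal exactly one chosen worker.
void signal_worker(Worker& w) {
    char byte = 1;
    write(w.wake_fd[1], &byte, 1);
}

// Worker loop: sleeps until its own pipe has a byte, does its job,
// then reports completion back on the second pipe.
void worker_loop(int wake_read, int done_write) {
    char byte;
    while (read(wake_read, &byte, 1) == 1) {
        // ... do the work selected by the manager ...
        write(done_write, &byte, 1);     // tell the manager this thread is done
    }
}

The manager can then select(2) across all the done_fd read ends to learn which workers have finished.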
This is similar to, but a bit different from, existing questions. Say I have many threads that open the same file, but they all do their own fopen and maintain their own FILE pointer.
a) is it necessary to lock fwrite calls if they have their own FILE ptrs?
b) if it is necessary, is locking around fwrite enough or will they potentially flush at different times and end up intermingling when they flush? If yes, would locking on fwrite and then fflush cover it?
This question cannot be answered at the level of the programming language. As far as the programming language is concerned, those file handles are completely independent objects, and whatever you do with one has no effect whatsoever on another.
The question is really about the operating system: can it handle multiple write operations to the same underlying file at the same time? In other words, are those writes atomic? I can't say for all of them, but on Linux, for example, writes of less than PIPE_BUF bytes are atomic.
For a quick fix, yeah, you can put a lock around the I/O part. That'd work, I guarantee it. As for flushing the I/O cache, I'd recommend not doing that. It's usually best to let the OS handle I/O timing, because the kernel knows best what's going on. The write won't take effect immediately after calling flush anyway, just like other flush-style operations (Java GC, glFlush, and so on). If you choose to stick to this option, be mindful of the start and end points of the concurrent I/O: you wouldn't want a case where the main thread closes the file while another worker thread is still trying to do I/O on it.
The general solution to this problem is creating a thread that handles the file exclusively. If other threads need to read from or write to the file, they must ask that thread to do it for them. This is tricky, I know; you'd need to put together a simple protocol and a sync mechanism, but in a nutshell it goes like this:
Prep a queue, a CV (condition variable), and a lock. Create a thread and open the file; it doesn't matter who opens the file.
The thread spawns and waits for the queue to be filled.
Other threads send I/O requests to that thread. Each request includes the data for the file and an opcode.
The thread handles the requests from the queue. This is where the real I/O happens.
You could use an anonymous FIFO instead of a queue. Or skip the opcode part if the file is write-only.
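A condensed sketch of that protocol, assuming a write-only file so the opcode can be skipped (the Request layout and names are made up for illustration):

#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <queue>
#include <string>

// Illustrative request: write-only, so no opcode is needed.
struct Request { std::string data; };

std::mutex mtx;
std::condition_variable cv;
std::queue<Request> requests;
bool shutting_down = false;

// Other threads: ask the I/O thread to do the write for them.
void submit(Request r) {
    {
        std::lock_guard<std::mutex> lock(mtx);
        requests.push(std::move(r));
    }
    cv.notify_one();
}

// Tell the I/O thread to finish the queued work and exit.
void shutdown_io() {
    {
        std::lock_guard<std::mutex> lock(mtx);
        shutting_down = true;
    }
    cv.notify_one();
}

// The one thread that owns the FILE*; all real I/O happens here.
void io_thread(FILE* f) {
    for (;;) {
        std::unique_lock<std::mutex> lock(mtx);
        cv.wait(lock, [] { return !requests.empty() || shutting_down; });
        if (requests.empty()) break;              // shutting down and queue drained
        Request r = std::move(requests.front());
        requests.pop();
        lock.unlock();                            // don't hold the lock during I/O
        fwrite(r.data.data(), 1, r.data.size(), f);
    }
    fclose(f);
}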
Unlike network I/O, modern OSes can't do regular file I/O in a non-blocking manner, so expect significant blocking time (I/O wait). Also, there's the problem that the queue fills up too quickly and eats a lot of memory when I/O is relatively slow. There will be cases where the whole program has to wait for the I/O to complete before terminating itself; not much you can do about that. You could close the file from another thread while I/O is in progress on Linux (close() is MT-safe), but I don't know how that would work on other OSes.
There are alternatives like asynchronous file I/O or overlapped I/O, which involve signal handling or callbacks. Using these doesn't require creating a thread, but each has pros and cons, mostly regarding portability.
I'm looking for something that could be used for polling (like select, kqueue, epoll, i.e. not busy polling) in C/C++. In other words, I need to block a thread, and then wake it up from another thread with as little overhead as possible.
A mutex + condition variable works, but there is a lot of overhead. A futex also works, but that's for Linux only (or maybe not?). Extra synchronization is not required as long as the polling itself works properly, e.g. no race when I call wait and wake in two threads.
Edit: If such a "facility" doesn't exist in FreeBSD, how to create one with C++11 built-in types and system calls?
Edit2: Since this question is migrated to SO, I'd like to make it more general (not for FreeBSD only)
Semaphores are not mutexes, and would work with slightly less overhead (avoiding the mutex+condvar re-lock, for example).
Note that since any solution where a thread sleeps until woken will involve a kernel syscall, it still isn't cheap. Assuming x86_64 glibc and the FreeBSD libc are both reasonable implementations, the unavoidable cost seems to be:
user-mode synchronisation of the count (with a CAS or similar)
kernel management of the wait queue and thread sleep/wait
I assume the mutex + condvar overhead you're worried about is the cond_wait->re-lock->unlock sequence, which is indeed avoided here.
You want semaphores, not mutexes, for the signaling between the two threads.
http://man7.org/linux/man-pages/man3/sem_wait.3.html
Semaphores can be used as a counter: if you have a queue, you increment (post) the semaphore every time you insert a message, and your receiver decrements (waits on) the semaphore for every message it takes out. If the counter reaches zero, the receiver blocks until something is posted.
So a typical pattern is to combine a mutex and a semaphore, like this:
sender:
mutex.lock
insert message in shared queue
mutex.unlock
semaphore.post
receiver:
semaphore.wait
mutex.lock
dequeue message from shared structure
mutex.unlock
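In code, that pattern might look roughly like this, using POSIX sem_t (the Message type is illustrative, and sem_init(&items, 0, 0) is assumed to have been called once at startup):

#include <semaphore.h>
#include <mutex>
#include <queue>

struct Message { /* ... */ };

std::mutex mtx;
std::queue<Message> q;
sem_t items;                 // counts queued messages; initialised to 0

// Sender: lock, insert, unlock, then post the semaphore.
void send(Message m) {
    {
        std::lock_guard<std::mutex> lock(mtx);
        q.push(std::move(m));
    }
    sem_post(&items);        // one more message available; wakes a blocked receiver
}

// Receiver: wait on the semaphore (blocks while the count is zero),
// then lock just long enough to dequeue.
Message receive() {
    sem_wait(&items);
    std::lock_guard<std::mutex> lock(mtx);
    Message m = std::move(q.front());
    q.pop();
    return m;
}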
I'm writing a POSIX-compatible multi-threaded server in C/C++ that must be able to accept, read from, and write to a large number of connections asynchronously. The server has several worker threads which perform tasks and occasionally (and unpredictably) queue data to be written to the sockets. Data is also occasionally (and unpredictably) written to the sockets by the clients, so the server must also read asynchronously. One obvious way of doing this is to give each connection a thread which reads from and writes to its socket; this is ugly, though, since each connection may persist for a long time and the server may thus have to hold hundreds or thousands of threads just to keep track of connections.
A better approach would be to have a single thread that handled all communications using the select()/pselect() functions. I.e., a single thread waits on any socket to be readable, then spawns a job to process the input that will be handled by a pool of other threads whenever input is available. Whenever the other worker threads produce output for a connection, it gets queued, and the communication thread waits for that socket to be writable before writing it.
The problem with this is that the communication thread may be waiting in the select() or pselect() function when output is queued by the worker threads of the server. It's possible that, if no input arrives for several seconds or minutes, a queued chunk of output will just wait for the communication thread to be done select()ing. This shouldn't happen, however--data should be written as soon as possible.
Right now I see a couple of solutions to this that are thread-safe. One is to have the communication thread busy-wait on input and update the list of sockets it waits on for writing every tenth of a second or so. This isn't optimal since it involves busy-waiting, but it will work. Another option is to use pselect() and send the USR1 signal (or something equivalent) whenever new output has been queued, allowing the communication thread to update the list of sockets it is waiting on for writable status immediately. I prefer the latter here, but still dislike using a signal for something that should be a condition (pthread_cond_t). Yet another option would be to include, in the list of file descriptors on which select() is waiting, a dummy file that we write a single byte to whenever a socket needs to be added to the writable fd_set for select(); this would wake up the communication thread because that particular dummy file would then be readable, thus allowing it to immediately update its writable fd_set.
I feel, intuitively, that the second approach (with the signal) is the 'most correct' way to program the server, but I'm curious whether anyone knows which of the above is the most efficient generally speaking, whether either of the above will cause race conditions that I'm not aware of, or whether anyone knows of a more general solution to this problem. What I really want is a pthread_cond_wait_and_select() function that allows the comm thread to wait on both a change in sockets and a signal from a condition.
Thanks in advance.
This is a fairly common problem.
One often used solution is to have pipes as a communication mechanism from worker threads back to the I/O thread. Having completed its task a worker thread writes the pointer to the result into the pipe. The I/O thread waits on the read end of the pipe along with other sockets and file descriptors and once the pipe is ready for read it wakes up, retrieves the pointer to the result and proceeds with pushing the result into the client connection in non-blocking mode.
Note that since pipe reads and writes of less than or equal to PIPE_BUF bytes are atomic, the pointers get written and read in one shot. One can even have multiple worker threads writing pointers into the same pipe because of the atomicity guarantee.
Unfortunately, the best way to do this is different for each platform. The canonical, portable way to do it is to have your I/O thread block in poll. If you need to get the I/O thread to leave poll, you send a single byte on a pipe that the thread is polling. That will cause the thread to exit from poll immediately.
On Linux, epoll is the best way. On BSD-derived operating systems (including OSX, I think), kqueue. On Solaris, it used to be /dev/poll and there's something else now whose name I forget.
You may just want to consider using a library like libevent or Boost.Asio. They give you the best I/O model on each platform they support.
Your second approach is the cleaner way to go. It's totally normal to have things like select or epoll include custom events in your list. This is what we do on my current project to handle such events. We also use timers (on Linux timerfd_create) for periodic events.
On Linux, eventfd lets you create such arbitrary user events for this purpose, so I'd say it is quite accepted practice. If you are limited to POSIX-only functions, a pipe or socketpair is what I've also seen used.
Busy-polling is not a good option. First, you'll be scanning memory that is being used by other threads, causing CPU memory contention. Second, you'll constantly have to return to your select call, which will create a huge number of system calls and context switches and hurt overall system performance.
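As a side note on the timers mentioned above, a minimal timerfd sketch (the interval and names are illustrative) could look like this; the returned descriptor is registered with epoll like any socket:

#include <sys/timerfd.h>
#include <unistd.h>
#include <cstdint>

// Create a timer fd that becomes readable every interval_ms milliseconds.
int make_periodic_timer_fd(long interval_ms) {
    int tfd = timerfd_create(CLOCK_MONOTONIC, TFD_NONBLOCK);
    itimerspec spec = {};
    spec.it_interval.tv_sec  = interval_ms / 1000;
    spec.it_interval.tv_nsec = (interval_ms % 1000) * 1000000L;
    spec.it_value = spec.it_interval;        // first expiry after one interval
    timerfd_settime(tfd, 0, &spec, nullptr);
    return tfd;
}

// When epoll reports the fd readable, read the 8-byte expiry count to clear it.
void on_timer_readable(int tfd) {
    uint64_t expirations;
    read(tfd, &expirations, sizeof(expirations));
    // ... run the periodic work ...
}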
I'm working on a project where a primary server thread needs to dispatch events to a series of worker threads. The work that goes on in the worker threads relies on polling (i.e. epoll or kqueue, depending on the UNIX system in question), with timeouts on these operations needing to be handled. This means that a normal condition variable or semaphore structure is not viable for this dispatch, as it would make one or the other block, resulting in unwanted latency between handling either the events coming from polling or the events originating from the server thread.
So, I'm wondering what the optimal construct for dispatching such events between threads in a pollable fashion is. Essentially, all that needs to be delivered is a pollable "signal" that tells the worker thread that it has more events to fetch. I've looked at using UNIX pipes (unnamed ones, as it's internal to the process), which seems like a decent solution given that a single byte can be written to the pipe and read back out when the queue is cleared -- but I'm wondering if this is the best approach available? Or the fastest?
Alternatively, there is the possibility of using signalfd(2) on Linux, but as this is not available on BSD systems, I'd rather avoid this construct. I'm also wondering how large the overhead of using system signals actually is.
Jan Hudec's answer is correct, although I wouldn't recommend using signals for a few reasons:
Older versions of glibc emulated pselect and ppoll in a non-atomic fashion, making them basically worthless. Even when you used the mask correctly, signals could get "lost" between the pthread_sigprocmask and select calls, meaning they don't cause EINTR.
I'm not sure signalfd is any more efficient than the pipe. (Haven't tested it, but I don't have any particular reason to believe it is.)
signals are generally a pain to get right. I've spent a lot of effort on them (see my sigsafe library) and I'd recommend avoiding them if you can.
Since you're trying to have asynchronous handling portable to several systems, I'd recommend looking at libevent. It will abstract epoll or kqueue for you, and it will even wake up workers on your behalf when you add a new event. See event.c
2058 static inline int
2059 event_add_internal(struct event *ev, const struct timeval *tv,
2060 int tv_is_absolute)
2061 {
...
2189 /* if we are not in the right thread, we need to wake up the loop */
2190 if (res != -1 && notify && EVBASE_NEED_NOTIFY(base))
2191 evthread_notify_base(base);
...
2196 }
Also,
The worker thread deals with both socket I/O and asynchronous disk I/O, which means that it is optimally always waiting for the event queuing mechanism (epoll/kqueue).
You're likely to be disappointed here. These event queueing mechanisms don't really support asynchronous disk I/O. See this recent thread for more details.
As far as performance goes, the cost of a system call is huge compared to other operations, so it's the number of system calls that matters. There are two options:
Use the pipes as you wrote. If you have any useful payload for the message, you get one system call to send, one system call to wait, and one system call to receive. Try to pass any relevant data down the pipe instead of reading it from a shared structure, to avoid additional overhead from locking.
select and poll have variants that also wait for signals (pselect, ppoll). Linux epoll can do the same using signalfd, so the remaining question is whether kqueue can wait for signals, which I don't know. If it can, then you could use them (you are using different mechanisms on Linux and *BSD anyway). It would save you the syscall for reading if you don't have a good use for the passed data.
I would expect passing the data over a socket to be more efficient if it allows you to do away with any other locking.
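For the second option, a pselect-based sketch might look like the following; SIGUSR1, the empty handler, and the single listen_fd are assumptions of mine, and error handling is trimmed. Another thread wakes this one with pthread_kill(io_thread, SIGUSR1).

#include <signal.h>
#include <sys/select.h>
#include <pthread.h>

static void on_wake(int) { /* only needed so the signal interrupts pselect */ }

void io_loop(int listen_fd) {
    // Keep SIGUSR1 blocked normally; it is only deliverable inside pselect.
    sigset_t blocked, during_pselect;
    sigemptyset(&blocked);
    sigaddset(&blocked, SIGUSR1);
    pthread_sigmask(SIG_BLOCK, &blocked, &during_pselect);
    sigdelset(&during_pselect, SIGUSR1);

    struct sigaction sa = {};
    sa.sa_handler = on_wake;
    sigaction(SIGUSR1, &sa, nullptr);

    for (;;) {
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(listen_fd, &readfds);
        int n = pselect(listen_fd + 1, &readfds, nullptr, nullptr,
                        nullptr, &during_pselect);
        if (n < 0) {
            // EINTR: another thread sent SIGUSR1 to say "re-check your
            // queues / rebuild your fd sets" before waiting again.
            continue;
        }
        // ... handle the ready descriptors ...
    }
}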