Using a mutex semaphore between two tasks - c++

I am trying to understand just how to use semaphores in general. I have chosen a mutex type semaphore instead of a binary in order to maintain task priorities. The synopsis is:
I have two tasks, WebTask and DBTask.
WebTask is a simple task that takes (can you guess) web requests and does something with them.
DBTask is a (you got it) database task that runs on a messaging queue type system, so in order to work with it, you must send a message to its queue with n arguments.
Here is how things need to go down
WebTask receives an input and wants to get data from DBTask.
WebTask constructs a message for DBTask and sends it along the main messaging system with DBTask's id and all arguments
WebTask must wait for DBTask to
Receive the message
Talk to the database
Format the result
Without semaphores, here's what is happening
WebTask constructs and sends msg
WebTask continues on
DBTask gets msg
DBTask talks to database, gets data, etc
So obviously they are out of sync (should be 1,3,4,2)
This is just one thing I tried that I thought would work..
WebTask create a new mutex semaphore
WebTask constructs and sends a msg, while also telling DBTask about the semaphore ID
WebTask takes the semaphore
DBTask gets msg
DBTask processes msg
DBTasks gives back semaphore
This changed nothing.
What I don't understand at all (after many attempts).
Who should be in charge of the semaphore? Who should create it?
When is the semaphore taken?
When is the semaphore given?
(Notes: I am on VxWorks)

If you have to do your tasks in such a synchronous manner, it's easy enough.
The web task creates a semaphore at startup. The web task gets data, loads it into some request/response object instance, along with its semaphore reference/pointer. It fires it off to the queue and waits on the semaphore. The DB task gets the object, reads the request, processes it, loads the response field/s with results, (or error messages:), and signals the semaphore. The web task then runs on, knowing that the response is available.
That way, several web tasks can use the DB and it's input queue.
Forget the mutex stuff, you don't need any mutex, (well, there may well be one protecting the DB input producer-consumer queue but, presumably, that is an internal matter).

In WebTask, after sending the msg to DBTask, it waits on a semephore (semTake) while in DBTask, after processing the Database request and data formating, it release a semephore(semGive). I think that should make your 2 tasks synchronized.

Related

How to implement long running gRPC async streaming data updates in C++ server

I'm creating an async gRPC server in C++. One of the methods streams data from the server to clients - it's used to send data updates to clients. The frequency of the data updates isn't predictable. They could be nearly continuous or as infrequent as once per hour. The model used in the gRPC example with the "CallData" class and the CREATE/PROCESS/FINISH states doesn't seem like it would work very well for that. I've seen an example that shows how to create a 'polling' loop that sleeps for some time and then wakes up to check for new data, but that doesn't seem very efficient.
Is there another way to do this? If I use the "CallData" method can it block in the 'PROCESS' state until there's data (which probably wouldn't be my first choice)? Or better, can I structure my code so I can notify a gRPC handler when data is available?
Any ideas or examples would be appreciated.
In a server-side streaming example, you probably need more states, because you need to track whether there is currently a write already in progress. I would add two states, one called WRITE_PENDING that is used when a write is in progress, and another called WRITABLE that is used when a new message can be sent immediately. When a new message is produced, if you are in state WRITABLE, you can send immediately and go into state WRITE_PENDING, but if you are in state WRITE_PENDING, then the newly produced message needs to go into a queue to be sent after the current write finishes. When a write finishes, if the queue is non-empty, you can grab the next message from the queue and immediately start a write for it; otherwise, you can just go into state WRITABLE and wait for another message to be produced.
There should be no need to block here, and you probably don't want to do that anyway, because it would tie up a thread that should otherwise be polling the completion queue. If all of your threads wind up blocked that way, you will be blind to new events (such as new calls coming in).
An alternative here would be to use the C++ sync API, which is much easier to use. In that case, you can simply write straight-line blocking code. But the cost is that it creates one thread on the server for each in-progress call, so it may not be feasible, depending on the amount of traffic you're handling.
I hope this information is helpful!

Using thread class to manage sending/receiving messages

I have a class that I'm using to manage creation of and destruction of threads that are responsible for sending and receiving CAN messages. I don't know if this is the best way to go about it, so I'm looking for advice on how to manage my threads for send messages and receive messages.
Basically I want spawnThread() to spawn a thread for the object passed to it.
so, something to the effect of
spawnThread(T obj)
{
std::thread (&T::obj, this);
}
My expectation was that I would use the Thread class to manage starting and ending the thread for two separate classes SendMessage and ReceiveMessage. Is there a better way to handle threading for sending and receive messages?
There is no such thing as a Manager pattern in OO. A thread either does a continuous job like waiting for connections or does a single shot job. The last type typically are worker-threads that are reused.
Now coming to the question. Threading does not scale. If you have send/receive tasks, process them in a fixed-size worker-thread pool. As a consequence, your application will be react slower if the workload extends your pool size as new request have to wait for a worker, but it will continue to work.

Send and receive data by tcp socket in one thread

I have an applications with 2 threads. The first thread (main-thread) and the second thread (tcp-client-thread). The main-thread generates some messages and puts their in queue for tcp-client-thread. tcp-client-thread has to send those messages to server. But, tcp-client-thread also has to receive some messages from server.
How can I do that? recv stops current thread. Set up timeout forrecv? Then after recv timeout check queue (from main-thread) and if there is messages send their is no any messages start recv again?
You can do your I/O in one non-spinning/non-delayed thread but it's much more complex then just simply creating another thread as suggested in another answer. In short, you'll have to modify your code to handle waiting for multiple event types simultaneously, i.e. an event on the socket OR on a condition signalling data to send, for example. On Windows, you'd use something like WSAEventSelect + WaitForMultipleObjects instead of select, and on Linux you'd use something like eventfd with select. Note that when handling the socket, if it's blocking, you'd want to check for readability before issuing a recv and check for writeability before issuing a send so you don't block on one or the other. Like I said though, easier to just create a send thread...
The thing you need is non-blocking/asynchronous I/O.
You should read some theory before trying to forge any code.
This article, for example:
http://www.wangafu.net/~nickm/libevent-book/01_intro.html
If you are going to use 2 threads, you might want to extend to 3 threads. Let the send and receive functions be on separate threads.
The send thread is sleeping until the main thread gives it data. Specifically, a function in the send software unit places data into the queue, then signals the thread to wake up. The thread wakes up and sends data until the queue is empty, then it goes back to sleep.
Conversely, the receive thread sleeps until it receives data. It appends data to another queue, notifies the main thread that data was received and goes back to sleep.
Edit 1: One Thread
Per your title, if you want to perform the I/O in one thread, you will need to have a polling loop (you can have limited waiting, but not advised).
Loop:
if (data received) then place data into input queue.
if (data in input queue) process some data (use small chunks).
if (data in output queue) send some data.
end-loop.
The idea is to keep the blocks of data small to prevent missing of incoming data. The data can be processed and output when there is no data (and with multiple iterations). This will resolve most synchronization issues.

Help needed for implementing a thread Monitoring Mechanism

I am working on a multithreaded middleware enviornment. The framework is basically a capturing and streaming framework. So it involves a number of threads.
To give you all a brief idea of the threading architecture:
There are seprate threads for demultiplexer, receiveVideo, DecodeVideo, DisplayVideo etc. Each thread performs its functionlity, for eg:
demultiplexer extracts audio, video packets
receivevideo receives header + payload of video packet & removes payload
DecodeVideo receives payload & decodes payload packet
DisplayVideo receives decoded packets & displays the decoded packets on display
Thus each thread feeds the extracted data to the next thread. The threads share data buffers amongst them and the buffers are synchronised through use of mutexes and semaphores. Similarly, there are other threads for handling ananlogvideo and analogaudio etc.
All the threads are spawned in during initialization but they remain blocked on a semaphore and depending upon the input(analog/digitial) selective semaphores are signalled so that specifc threads get unblocked & move on to do their work. At various stages each thread calls some lower level(driver calls)to get data or write data etc. These calls are blocking and the errors resulting from these calls(driver returning corrupted data, driver stalling) should be handled but are not being handled currently.
I wanted to implement a thread monitoring mechanism where a thread will monitor these worker threads and if an error condition occurs will take some preventive actions. As I understand certain such mechanisms are commonly used like Watchdogs in UI or MMI applications. I am trying to look for something similar.
I am using pthreads and No Boost or STL(its a legacy code, pretty much procedural C++)
Any ideas about specific framework or design patterns or open source projects which do something similar and might help in with ideas for implementing my requirement?
Can you ping the threads - periodically send each one a message on its usual input queue, interleaved with all the other normal stuff, asking it to return its status? When each handler thread gets the message, it loads the message with status stuff - how many messages its processed since the last ping, length of its input/output queue, last time that its driver returned OK, that sort of stats - and queues it back to your Thread Monitoring Mechanism. Your TMM would have to time out the replies in case some thread/s is/are stuck.
You could, maybe, just post one message down the whole chain, each thread adding its own status in different fields. That would mean only one timeout, after which your TMM would have to examine the message to see how far down the chain it got.
There are other things - I like to keep an on-screen dump, on a 1s timer, of the length of queues and depth of buffer pools. If something stuffs, I can usually tell roughly where it is, (eg. a pool is emptying and some queue is growing - the queue comsumer is wasted).
Rgds,
Martin
What about using a signalling system to wake up your monitoring thread when something's gone awry in one of your worker threads. You can emulate the signalling with an ResetEvent of some type.
When an exception occurs in your worker thread, you have some data structure you fill up with the data about the exception and then you can pass that on to your monitoring thread. You wake up the monitoring thread by using the event.
Then the monitoring thread can do what you need it to do.
I'm guessing you don't wish to have your monitoring thread active unless something has gone wrong, right?

I want to wait on both a file descriptor and a mutex, what's the recommended way to do this?

I would like to spawn off threads to perform certain tasks, and use a thread-safe queue to communicate with them. I would also like to be doing IO to a variety of file descriptors while I'm waiting.
What's the recommended way to accomplish this? Do I have to created an inter-thread pipe and write to it when the queue goes from no elements to some elements? Isn't there a better way?
And if I have to create the inter-thread pipe, why don't more libraries that implement shared queues allow you to create the shared queue and inter-thread pipe as a single entity?
Does the fact I want to do this at all imply a fundamental design flaw?
I'm asking this about both C++ and Python. And I'm mildly interested in a cross-platform solution, but primarily interested in Linux.
For a more concrete example...
I have some code which will be searching for stuff in a filesystem tree. I have several communications channels open to the outside world through sockets. Requests that may (or may not) result in a need to search for stuff in the filesystem tree will be arriving.
I'm going to isolate the code that searches for stuff in the filesystem tree in one or more threads. I would like to take requests that result in a need to search the tree and put them in a thread-safe queue of things to be done by the searcher threads. The results will be put into a queue of completed searches.
I would like to be able to service all the non-search requests quickly while the searches are going on. I would like to be able to act on the search results in a timely fashion.
Servicing the incoming requests would generally imply some kind of event-driven architecture that uses epoll. The queue of disk-search requests and the return queue of results would imply a thread-safe queue that uses mutexes or semaphores to implement the thread safety.
The standard way to wait on an empty queue is to use a condition variable. But that won't work if I need to service other requests while I'm waiting. Either I end up polling the results queue all the time (and delaying the results by half the poll interval, on average), blocking and not servicing requests.
Whenever one uses an event driven architecture, one is required to have a single mechanism to report event completion. On Linux, if one is using files, one is required to use something from the select or poll family meaning that one is stuck with using a pipe to initiate all none file related events.
Edit: Linux has eventfd and timerfd. These can be added to your epoll list and used to break out of the epoll_wait when either triggered from another thread or on a timer event respectively.
There is another option and that is signals. One can use fcntl modify the file descriptor such that a signal is emitted when the file descriptor becomes active. The signal handler may then push a file-ready message onto any type of queue of your choosing. This may be a simple semaphore or mutex/condvar driven queue. Since one is now no longer using select/poll, one no longer needs to use a pipe to queue none file based messages.
Health warning: I have not tried this and although I cannot see why it will not work, I don't really know the performance implications of the signal approach.
Edit: Manipulating a mutex in a signal handler is probably a very bad idea.
I've solved this exact problem using what you mention, pipe() and libevent (which wraps epoll). The worker thread writes a byte to its pipe FD when its output queue goes from empty to non-empty. That wakes up the main IO thread, which can then grab the worker thread's output. This works great is actually very simple to code.
You have the Linux tag so I am going to throw this out: POSIX Message Queues do all this, which should fulfill your "built-in" request if not your less desired cross-platform wish.
The thread-safe synchronization is built-in. You can have your worker threads block on read of the queue. Alternatively MQs can use mq_notify() to spawn a new thread (or signal an existing one) when there is a new item put in the queue. And since it looks like you are going to be using select(), MQ's identifier (mqd_t) can be used as a file descriptor with select.
It seems nobody has mentioned this option yet:
Don't run select/poll/etc. in your "main thread". Start a dedicated secondary thread which does the I/O and pushes notifications into your thread-safe queue (the same queue which your other threads use to communicate with the main thread) when I/O operations complete.
Then your main thread just needs to wait on the notification queue.
Duck's and twk's are actually better answers than doron's (the one selected by the OP), in my opinion. doron suggests writing to a message queue from within the context of a signal handler, and states that the message queue can be "any type of queue." I would strongly caution you against this since many C library/system calls cannot safely be called from within a signal handler (see async-signal-safe).
In particuliar, if you choose a queue protected by a mutex, you should not access it from a signal handler. Consider this scenario: your consumer thread locks the queue to read it. Immediately after, the kernel delivers the signal to notify you that a file descriptor now has data on it. You signal handler runs in the consumer thread, necessarily), and tries to put something on your queue. To do this, it first has to take the lock. But it already holds the lock, so you are now deadlocked.
select/poll is, in my experience, the only viable solution to an event-driven program in UNIX/Linux. I wish there were a better way inside a mutlithreaded program, but you need some mechanism to "wake up" your consumer thread. I have yet to find a method that does not involve a system call (since the consumer thread is on a waitqueue inside the kernel during any blocking call such as select).
EDIT: I forgot to mention one Linux-specific way to handle signals when using select/poll: signalfd(2). You get a file descriptor you can select/poll on, and you handling code runs normally instead of in a signal handler's context.
This is a very common seen problem, especially when you are developing network server-side program. Most Linux server-side program's main look will loop like this:
epoll_add(serv_sock);
while(1){
ret = epoll_wait();
foreach(ret as fd){
req = fd.read();
resp = proc(req);
fd.send(resp);
}
}
It is single threaded(the main thread), epoll based server framework. The problem is, it is single threaded, not multi-threaded. It requires that proc() should never blocks or runs for a significant time(say 10 ms for common cases).
If proc() will ever runs for a long time, WE NEED MULTI THREADS, and executes proc() in a separated thread(the worker thread).
We can submit task to the worker thread without blocking the main thread, using a mutex based message queue, it is fast enough.
epoll_add(serv_sock);
while(1){
ret = epoll_wait();
foreach(ret as fd){
req = fd.read();
queue.add_job(req); // fast, non blockable
}
}
Then we need a way to obtain the task result from a worker thread. How? If we just check the message queue directly, before or after epoll_wait().
epoll_add(serv_sock);
while(1){
ret = epoll_wait(); // may blocks for 10ms
resp = queue.check_result(); // fast, non blockable
foreach(ret as fd){
req = fd.read();
queue.add_job(req); // fast, non blockable
}
}
However, the checking action will execute after epoll_wait() to end, and epoll_wait() usually blocks for 10 micro seconds(common cases) if all file descriptors it waits are not active.
For a server, 10 ms is quite a long time! Can we signal epoll_wait() to end immediately when task result is generated?
Yes! I will describe how it is done in one of my open source project:
Create a pipe for all worker threads, and epoll waits on that pipe as well. Once a task result is generated, the worker thread writes one byte into the pipe, then epoll_wait() will end in nearly the same time! - Linux pipe has 5 us to 20 us latency.
In my project SSDB(a Redis protocol compatible in-disk NoSQL database), I create a SelectableQueue for passing messages between the main thread and worker threads. Just like its name, SelectableQueue has an file descriptor, which can be wait by epoll.
SelectableQueue: https://github.com/ideawu/ssdb/blob/master/src/util/thread.h#L94
Usage in main thread:
epoll_add(serv_sock);
epoll_add(queue->fd());
while(1){
ret = epoll_wait();
foreach(ret as fd){
if(fd is queue){
sock, resp = queue->pop_result();
sock.send(resp);
}
if(fd is client_socket){
req = fd.read();
queue->add_task(fd, req);
}
}
}
Usage in worker thread:
fd, req = queue->pop_task();
resp = proc(req);
queue->add_result(fd, resp);
C++11 has std::mutex and std::condition_variable. The two can be used to have one thread signal another when a certain condition is met. It sounds to me like you will need to build your solution out of these primitives. If you environment does not yet support these C++11 library features, you can find very similar ones at boost. Sorry, can't say much about python.
One way to accomplish what you're looking to do is by implementing the Observer Pattern
You would register your main thread as an observer with all your spawned threads, and have them notify it when they were done doing what they were supposed to (or updating during their run with the info you need).
Basically, you want to change your approach to an event-driven model.