c++ watchdog for 3rd party lib calls - c++

I have a problem with long running boost::regex_match(...) invocation in a threaded process environment. But it could be another lib (API call) having the same problem.
Is there a generic way to set up a watchdog for such?
For non-threaded process alarm() can be used to detect timeout.
However, signals don't play nicely with threads. I can avoid direct use of alarm() in the thread and delegate timer mgt. to a dedicated separate thread and let that one use pthread_kill(...) to address the correct threads (this is just an idea - i didn't yet verify that part).
However, also this only interrupts and detects the situation, but cannot gracefully stop boost::regex_match(...).
I played around with Throwing an exception from within a signal handler using sigsetjmp() and siglongjmp() for the thread using boost::regex_match(..).
But it causes memory leaks in boost::regex_match(...) becausesiglongjmp()` bypasses destructors.
How can i gracefully stop a 3rd party API call - presuming that it's implemented exception safe?
Or does it have to be supported by some "stoppable" feature actively implemented in the 3rd party API? (is there some for the boost library?)
Maybe some strange idea, but:
Code can be implemented to be "thread-safe" and/or "exception-safe".
Would it be an option to define "longjmp-safe"? This could be done by passing an additional token to a lib to let is associate all resource allocations to that token. After longjmp() the client SW could ask the API separately to release those resources.
simpler maybe would just be some central init()/release() or register()/unregister() API call, by which the API could clean-up itself.

In a case where you have to:
monitor exceeding execution time
stop execution of processing
you should simply think for tasks instead of threads.
Using threads is something which sounds like "state of the art" but in practice tasks are very often the better way of implementation. Especially for controlling memory leeks in "undefined" end of execution, confine unwanted memory excess and control stack overruns etc.
In the case you have mentioned I tend to implement that as tasks. IPC works well on all known platforms but is not portable. If portability is no problem, changing to a task based solution is not a big deal.
A hanging task can be killed by a os call and all locks, memory and other resources like ipc/shared memory/pipes etc. will be removed automatically. So this fits much better to your problem and it did not depend on your external and maybe unchangeable third party components.

Related

Cancelling arbitary jobs running in a thread_pool

Is there a way for a thread-pool to cancel a task underway? Better yet, is there a safe alternative for on-demand cancelling opaque function calls in thread_pools?
Killing the entire process is a bad idea and using native handle to perform pthread_cancel or similar API is a last resort only.
Extra
Bonus if the cancellation is immediate, but it's acceptable if the cancellation has some time constraint 'guarantees' (say cancellation within 0.1 execution seconds of the thread in question for example)
More details
I am not restricted to using Boost.Thread.thread_pool or any specific library. The only limitation is compatibility with C++14, and ability to work on at least BSD and Linux based OS.
The tasks are usually data-processing related, pre-compiled and loaded dynamically using C-API (extern "C") and thus are opaque entities. The aim is to perform compute intensive tasks with an option to cancel them when the user sends interrupts.
While launching, the thread_id for a specific task is known, and thus some API can be sued to find more details if required.
Disclaimer
I know using native thread handles to cancel/exit threads is not recommended and is a sign of bad design. I also can't modify the functions using boost::this_thread::interrupt_point, but can wrap them in lambdas/other constructs if that helps. I feel like this is a rock and hard place situation, so alternate suggestions are welcome, but they need to be minimally intrusive in existing functionality, and can be dramatic in their scope for the feature-set being discussed.
EDIT:
Clarification
I guess this should have gone in the 'More Details' section, but I want it to remain separate to show that existing 2 answers are based o limited information. After reading the answers, I went back to the drawing board and came up with the following "constraints" since the question I posed was overly generic. If I should post a new question, please let me know.
My interface promises a "const" input (functional programming style non-mutable input) by using mutexes/copy-by-value as needed and passing by const& (and expecting thread to behave well).
I also mis-used the term "arbitrary" since the jobs aren't arbitrary (empirically speaking) and have the following constraints:
some which download from "internet" already use a "condition variable"
not violate const correctness
can spawn other threads, but they must not outlast the parent
can use mutex, but those can't exist outside the function body
output is via atomic<shared_ptr> passed as argument
pure functions (no shared state with outside) **
** can be lambda binding a functor, in which case the function needs to makes sure it's data structures aren't corrupted (which is the case as usually, the state is a 1 or 2 atomic<inbuilt-type>). Usually the internal state is queried from an external db (similar architecture like cookie + web-server, and the tab/browser can be closed anytime)
These constraints aren't written down as a contract or anything, but rather I generalized based on the "modules" currently in use. The jobs are arbitrary in terms of what they can do: GPU/CPU/internet all are fair play.
It is infeasible to insert a periodic check because of heavy library usage. The libraries (not owned by us) haven't been designed to periodically check a condition variable since it'd incur a performance penalty for the general case and rewriting the libraries is not possible.
Is there a way for a thread-pool to cancel a task underway?
Not at that level of generality, no, and also not if the task running in the thread is implemented natively and arbitrarily in C or C++. You cannot terminate a running task prior to its completion without terminating its whole thread, except with the cooperation of the task.
Better
yet, is there a safe alternative for on-demand cancelling opaque
function calls in thread_pools?
No. The only way to get (approximately) on-demand preemption of a specific thread is to deliver a signal to it (that is is not blocking or ignoring) via pthread_kill(). If such a signal terminates the thread but not the whole process then it does not automatically make any provision for freeing allocated objects or managing the state of mutexes or other synchronization objects. If the signal does not terminate the thread then the interruption can produce surprising and unwanted effects in code not designed to accommodate such signal usage.
Killing the entire process is a bad idea and using native handle to
perform pthread_cancel or similar API is a last resort only.
Note that pthread_cancel() can be blocked by the thread, and that even when not blocked, its effects may be deferred indefinitely. When the effects do occur, they do not necessarily include memory or synchronization-object cleanup. You need the thread to cooperate with its own cancellation to achieve these.
Just what a thread's cooperation with cancellation looks like depends in part on the details of the cancellation mechanism you choose.
Cancelling a non cooperative, not designed to be cancelled component is only possible if that component has limited, constrained, managed interactions with the rest of the system:
the ressources owned by the components should be managed externally (the system knows which component uses what resources)
all accesses should be indirect
the modifications of shared ressources should be safe and reversible until completion
That would allow the system to clean up resource, stop operations, cancel incomplete changes...
None of these properties are cheap; all the properties of threads are the exact opposite of these properties.
Threads only have an implied concept of ownership apparent in the running thread: for a deleted thread, determining what was owned by the thread is not possible.
Threads access shared objects directly. A thread can start modifications of shared objects; after cancellation, such modifications that would be partial, non effective, incoherent if stopped in the middle of an operation.
Cancelled threads could leave locked mutexes around. At least subsequent accesses to these mutexes by other threads trying to access the shared object would deadlock.
Or they might find some data structure in a bad state.
Providing safe cancellation for arbitrary non cooperative threads is not doable even with very large scale changes to thread synchronization objects. Not even by a complete redesign of the thread primitives.
You would have to make thread almost like full processes to be able to do that; but it wouldn't be called a thread then!

Two threads using the same websock handle- does it cause any issue?

We are having a C++ application to send and receive WebSocket messages
One thread to send the message (using WinHttpWebSocketSend)
the second thread to receive (using WinHttpWebSocketReceive)
But the same WebSocket handle is used across these 2 threads. Will it cause any problems? I don't know if we have to handle it another way. It works in our application - we are able to send and receive messages - but I don't know if it will have any problem in the production environment. Any one has better ideas?
Like most platforms, nearly all Windows API system calls do not provide thread barriers beyond preventing simultaneous access to the key parts of the kernel. While I could not say for sure (the documentation doesn't seem to answer your explicit question) I would be surprised if the WinHTTP API provides barriers that prevent multiple threads from stepping on each other (so to speak)--particularly because it's really just a "helper" API that uses the somewhat lower level Winsock stuff directly--and I would take it upon myself to implement the necessary barriers.
I'm also wondering why you're using threads in this manner to begin with. I know essentially nothing about the WinHTTP API, but I did notice WINHTTP_OPTION_ASSURED_NON_BLOCKING_CALLBACKS which leads me to believe that you can implement an asynchronous approach which would prevent any thread-safety issues to begin with (and probably be much faster and memory efficient).
It appears that the callback mechanism for WinHTTP is rather expressive. See WINHTTP_STATUS_CALLBACK. Presumably, you can simply use non-blocking operation, create an event listener, and associate the connection handle with dwContext. No threads involved.

C++ graceful shutdown best practices

I'm writing a multi-threaded c++ application for *nix operating systems. What are some best practices for terminating such an application gracefully? My instinct is that I'd want to install a signal handler on SIGINT (SIGTERM?) which stops/joins my threads. Also, is it possible to "guarantee" that all destructors are called (provided no other errors or exceptions are thrown while handling the signal)?
Some considerations come to mind:
designate 1 thread to be responsible for orchestrating the shutdown, eg, as Dithermaster suggested, this could be the main thread if you are writing a standalone application. Or if you are writing a library, provide an interface (eg function call) whereby a client program can terminate the objects created within the library.
you cannot guarantee destructors are called; that is up to you, and requires carefully calling delete for each new. Maybe smart pointers will help you. But, really, this is a design consideration. The major components should have start & stop semantics, which you could choose to invoke from the class constructor & destructor.
the shutdown sequence for a set of interacting objects is something that can require some effort to get correct. E.g., before you delete an object, are you sure some timer mechanism is not going to try calling it in few micro/milli/seconds later? Trial and error is your friend here; develop a framework which can repeatedly & rapidly start and stop your application to tease out shutdown related race-conditions.
signals are one way to trigger an event; others might be periodically polling for a known file, or opening a socket and receiving some data on it. Either way, you want to decouple the shutdown sequence code from the trigger event.
My recommendation is that the main thread shut down all worker threads before exiting itself. Send each worker an event telling it to clean up and exit, and wait for each one to do so. This will allow all C++ destructors to run.
Regarding signal management, the only thing you can portably and safely do inside a signal handler is to write to a variable of type sig_atomic_t (possibly volatile-qualified) and return. In general, you cannot call most functions and must not write to global memory. In other words, the handler should just set a flag to be tested inside your main routine, at some point you find appropriate, and the action resulting from the signal itself should be performed from there.
(Since there might be blocking I/O involved, consider studying POSIX Thread Cancellation. Your Unix clone (most notably Linux) might have peculiarities with respect to this and to the above.)
Regarding destructors, no magic is involved. They will be executed if control leaves a given scope through any means defined in the language. Leaving a scope through other means (for example, longjmp() or even exit()) does not trigger destructors.
Regarding general shutdown practices, there are divergent opinions on the field.
Some state that a "graceful termination", in the sense of releasing every resource ever allocated, should be performed. In C++, this usually means that all destructors should be properly executed before the process terminates. This is tricky in practice and often a source of much grief, specially in multithreaded programs, for a variety of reasons. Signals further complicate things by the very nature of asynchronous signal dispatching.
Because most of this work is totally useless, some others, like me, contend that the program must just terminate immediately, possibly shortly after undoing persistent changes to the system (like removing temporary files or restoring the screen resolution) and saving configuration. An apparently tidier cleanup is not only a waste of time (because the operating system will clean up most things like allocated memory, dangling threads and open file descriptors), but might be a serious waste of time (deallocators might touch paged out memory, uselessly forcing the system to page them in just for releasing them soon after the process terminates, for example), not mentioning the possibility of deadlocks being originated from joining threads.
Just say no. When you want to leave, call exit() (or even _exit(), but watch out for unflushed I/O) and that's it. More annoying than slow starting programs are slow terminating programs.

Controlled application shut-down strategy

Our (Windows native C++) app is composed of threaded objects and managers. It is pretty well written, with a design that sees Manager objects controlling the lifecycle of their minions. Various objects dispatch and receive events; some events come from Windows, some are home-grown.
In general, we have to be very aware of thread interoperability so we use hand-rolled synchronization techniques using Win32 critical sections, semaphores and the like. However, occasionally we suffer thread deadlock during shut-down due to things like event handler re-entrancy.
Now I wonder if there is a decent app shut-down strategy we could implement to make this easier to develop for - something like every object registering for a shutdown event from a central controller and changing its execution behaviour accordingly? Is this too naive or brittle?
I would prefer strategies that don't stipulate rewriting the entire app to use Microsoft's Parallel Patterns Library or similar. ;-)
Thanks.
EDIT:
I guess I am asking for an approach to controlling object life cycles in a complex app where many threads and events are firing all the time. Giovanni's suggestion is the obvious one (hand-roll our own), but I am convinced there must be various off-the-shelf strategies or frameworks, for cleanly shutting down active objects in the correct order. For example, if you want to base your C++ app on an IoC paradigm you might use PocoCapsule instead of trying to develop your own container. Is there something similar for controlling object lifecycles in an app?
This seems like a special case of the more general question, "how do I avoid deadlocks in my multithreaded application?"
And the answer to that is, as always: make sure that any time your threads have to acquire more than one lock at a time, that they all acquire the locks in the same order, and make sure all threads release their locks in a finite amount of time. This rule applies just as much at shutdown as at any other time. Nothing less is good enough; nothing more is necessary. (See here for a relevant discussion)
As for how to best do this... the best way (if possible) is to simplify your program as much as you can, and avoid holding more than one lock at a time if you can possibly help it.
If you absolutely must hold more than one lock at a time, you must verify your program to be sure that every thread that holds multiple locks locks them in the same order. Programs like helgrind or Intel thread checker can help with this, but it often comes down to simply eyeballing the code until you've proved to yourself that it satisfies this constraint. Also, if you are able to reproduce the deadlocks easily, you can examine (using a debugger) the stack trace of each deadlocked thread, which will show where the deadlocked threads are forever-blocked at, and with that information, you can that start to figure out where the lock-ordering inconsistencies are in your code. Yes, it's a major pain, but I don't think there is any good way around it (other than avoiding holding multiple locks at once). :(
One possible general strategy would be to send an "I am shutting down" event to every manager, which would cause the managers to do one of three things (depending on how long running your event-handlers are, and how much latency you want between the user initiating shutdown, and the app actually exiting).
1) Stop accepting new events, and run the handlers for all events received before the "I am shutting down" event. To avoid deadlocks you may need to accept events that are critical to the completion of other event handlers. These could be signaled by a flag in the event or the type of the event (for example). If you have such events then you should also consider restructuring your code so that those actions are not performed through event handlers (as dependent events would be prone to deadlocks in ordinary operation too.)
2) Stop accepting new events, and discard all events that were received after the event that the handler is currently running. Similar comments about dependent events apply in this case too.
3) Interrupt the currently running event (with a function similar to boost::thread::interrupt()), and run no further events. This requires your handler code to be exception safe (which it should already be, if you care about resource leaks), and to enter interruption points at fairly regular intervals, but it leads to the minimum latency.
Of course you could mix these three strategies together, depending on the particular latency and data corruption requirements of each of your managers.
As a general method, use an atomic boolean to indicate "i am shutting down", then every thread checks this boolean before acquiring each lock, handling each event etc. Can't give a more detailed answer unless you give us a more detailed question.

Using asynchronous method vs thread wait

I have 2 versions of a function which are available in a C++ library which do the same task. One is a synchronous function, and another is of asynchronous type which allows a callback function to be registered.
Which of the below strategies is preferable for giving a better memory and performance optimization?
Call the synchronous function in a worker thread, and use mutex synchronization to wait until I get the result
Do not create a thread, but call the asynchronous version and get the result in callback
I am aware that worker thread creation in option 1 will cause more overhead. I am wanting to know issues related to overhead caused by thread synchronization objects, and how it compares to overhead caused by asynchronous call. Does the asynchronous version of a function internally spin off a thread and use synchronization object, or does it uses some other technique like directly talk to the kernel?
"Profile, don't speculate." (DJB)
The answer to this question depends on too many things, and there is no general answer. The role of the developer is to be able to make these decisions. If you don't know, try the options and measure. In many cases, the difference won't matter and non-performance concerns will dominate.
"Premature optimisation is the root of all evil, say 97% of the time" (DEK)
Update in response to the question edit:
C++ libraries, in general, don't get to use magic to avoid synchronisation primitives. The asynchronous vs. synchronous interfaces are likely to be wrappers around things you would do anyway. Processing must happen in a context, and if completion is to be signalled to another context, a synchronisation primitive will be necessary to do that.
Of course, there might be other considerations. If your C++ library is talking to some piece of hardware that can do processing, things might be different. But you haven't told us about anything like that.
The answer to this question depends on context you haven't given us, including information about the library interface and the structure of your code.
Use asynchronous function because will probably do what you want to do manually with synchronous one but less error prone.
Asynchronous: Will create a thread, do work, when done -> call callback
Synchronous: Create a event to wait for, Create a thread for work, Wait for event, On thread call sync version , transfer result, signal event.
You might consider that threads each have their own environment so they use more memory than a non threaded solution when all other things are equal.
Depending on your threading library there can also be significant overhead to starting and stopping threads.
If you need interprocess synchronization there can also be a lot of pain debugging threaded code.
If you're comfortable writing non threaded code (i.e. you won't burn a lot of time writing and debugging it) then that might be the best choice.