OK, so this may be an odd situation but please bear with me.
I have a Python program which calls up a C++ class via a SWIG interface. I came to a point where I must asynchronously signal (to update a status) the Python code from the C++ library. Originally I had inefficient busy loops which polled a flag. I wanted to replace this by using SIGUSR1.
So, the problem is I discovered that even though these are separate 'threads', they share the same PID. That is, the Python and C++ code both report the same PID. I sent the signal using kill, and the Python code caught it in its handler; however, it did not interrupt the code as I expected. I tried two methods of waiting for the signal in the Python code: first by calling Python's signal.pause, which I read was supposed to return on reception of a signal. The other was a simple time.sleep, which was supposed to do basically the same thing: return when the signal comes through.
For some reason this isn't working - my C++ code sends the signal, the Python code receives it and calls the handler, however, the pause/sleep calls never return.
If it is possible to correctly signal the same process, how would you do it?
(and if this is just dumb forgive me and move on)
Signals are not the right tool for the job here. Normally, this would be a job for inter-thread synchronization primitives, such as
the locks from the thread module
the Event objects from the threading module
However, it's not easy to manipulate Python thread locks from C++. So I would use the old-fashioned, but very simple, approach of
a pipe, which the Python thread reads from, and the C++ thread writes exactly one byte to when it wants to wake up the Python.
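The one-byte pipe wake-up described above can be sketched roughly as follows. This is only an illustration under assumptions: the helper names (wait_for_wakeup, send_wakeup, demo_pipe_wakeup) are made up, and a C++ thread stands in for the reader, which in the original scenario would be the Python side doing a blocking os.read on the pipe's read descriptor.

```cpp
#include <cassert>
#include <cerrno>
#include <thread>
#include <unistd.h>

// Blocks until one byte arrives on read_fd.
// Returns the byte value, or -1 on error or end-of-file.
int wait_for_wakeup(int read_fd) {
    unsigned char byte = 0;
    ssize_t n;
    do {
        n = read(read_fd, &byte, 1);
    } while (n < 0 && errno == EINTR);  // retry if a signal interrupted the read
    return n == 1 ? byte : -1;
}

// Writes exactly one byte to wake up the waiting side.
bool send_wakeup(int write_fd, unsigned char status) {
    return write(write_fd, &status, 1) == 1;
}

// Hypothetical demo: a worker thread plays the C++ library,
// the calling thread plays the waiting (Python) side.
int demo_pipe_wakeup() {
    int fds[2];
    if (pipe(fds) != 0) return -1;
    const int write_fd = fds[1];
    std::thread worker([write_fd] { send_wakeup(write_fd, 42); });
    const int got = wait_for_wakeup(fds[0]);  // sleeps here, no busy loop
    worker.join();
    close(fds[0]);
    close(fds[1]);
    return got;
}
```

The waiting side sleeps inside read() until the byte arrives, so no polling is involved, and the byte itself can carry a small status code.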
If you are in the same program, I am not sure what you would need signals for in this case. Take a look at the observer pattern. If you have your Python event handlers subscribe to events in your C++ library, you can avoid signals altogether.
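A minimal sketch of the observer idea on the C++ side might look like this. StatusPublisher and demo_observer are hypothetical names, and the sketch does not show how a Python callable would be bridged through SWIG (that would need something like SWIG's director feature, which is an assumption about the setup).

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event source inside the C++ library.
class StatusPublisher {
public:
    using Handler = std::function<void(const std::string&)>;

    // Register a handler to be called on every status update.
    void subscribe(Handler handler) { handlers_.push_back(std::move(handler)); }

    // Notify all subscribers; handlers run on the calling thread.
    void publish(const std::string& status) const {
        for (const auto& handler : handlers_) handler(status);
    }

private:
    std::vector<Handler> handlers_;
};

// Hypothetical demo: one subscriber records every status it is told about.
std::vector<std::string> demo_observer() {
    StatusPublisher publisher;
    std::vector<std::string> seen;
    publisher.subscribe([&seen](const std::string& s) { seen.push_back(s); });
    publisher.publish("started");
    publisher.publish("done");
    return seen;
}
```

Note that the handlers run on whichever thread calls publish, so a subscriber that touches shared state still needs its own synchronization.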
I am implementing a function in a library which takes a while (up to a minute). It initializes a device. Now generally any long function should run in its own thread and report to the main thread when it completes, but I am not sure since this function is in a library.
My dilemma is this, even if I implement this in a separate thread, another thread in the application has to wait on it. If so why not let the application run this function in that thread anyways?
I could pass a queue or mailbox to the library function, but I would prefer a simpler mechanism so the library can be used from VB, VC, C# or other Windows environments.
Alternatively, I could pass the HWND of the window, and the library function could post a message to it when it completes instead of signaling an event. That seems like the most practical approach if I have to implement the function in its own thread. Is this reasonable?
Currently my function prototype is:
void InitDevice(HANDLE hWait)
When initialization is complete, I signal hWait. This works fine, but I am not convinced I should use a thread anyway when another secondary thread will have to wait on InitDevice. Should I pass an HWND instead? That way the message will be posted to the primary thread, and it will make better sense with multithreading.
In general, when I write library code, I normally try to stay away from creating threads unless it's really necessary. By creating a thread, you're forcing a particular threading model on the application. Perhaps they wish to use it from a very simplistic command-line tool where a single thread is fine. Or they could use it from a GUI tool where things must be multi-threaded.
So, instead, make it clear to the library user that the function call is a long-running blocking call, give them some callback mechanism to monitor the progress, and finally provide a way to immediately halt the operation, which a multi-threaded application could use.
What you do want to claim is being thread safe. Use mutexes to protect data items if there are other functions they can call to affect the operation of the blocking function.
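As a rough sketch of that shape (a blocking call, a progress callback, and a thread-safe way to halt it), assuming a hypothetical DeviceInit class whose ten steps stand in for the real device work:

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical library object; the ten loop steps stand in for real device work.
class DeviceInit {
public:
    using Progress = std::function<void(int percent)>;

    // Blocks until initialization finishes or cancel() is called.
    // Returns true if it ran to completion, false if it was cancelled.
    bool run(const Progress& on_progress) {
        for (int step = 1; step <= 10; ++step) {
            if (cancel_requested_.load()) return false;
            // ... one slice of the slow initialization would go here ...
            if (on_progress) on_progress(step * 10);
        }
        return true;
    }

    // May be called from any thread (or from the progress callback).
    void cancel() { cancel_requested_.store(true); }

private:
    std::atomic<bool> cancel_requested_{false};
};

// Hypothetical demo: cancel once progress reaches percent_limit.
// Returns 100 on completion, otherwise the last percentage reported.
int demo_cancel_at(int percent_limit) {
    DeviceInit init;
    int last = 0;
    const bool completed = init.run([&](int percent) {
        last = percent;
        if (percent >= percent_limit) init.cancel();
    });
    return completed ? 100 : last;
}
```

With this shape the application decides the threading model: a command-line tool can simply call run() on its only thread, while a GUI application can call it on a worker thread and call cancel() from the UI thread.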
This question is more for my personal curiosity than anything important. I'm trying to keep all my code compatible with at least Windows and Mac. So far I've learned that I should base my code on POSIX and that's just great but...
Windows doesn't have a sigaction function so signal is used? According to:
What is the difference between sigaction and signal? there are some problems with signal.
The signal() function does not block other signals from arriving while the current handler is executing; sigaction() can block other signals until the current handler returns.
The signal() function resets the signal action back to SIG_DFL (default) for almost all signals. This means that the signal() handler must reinstall itself as its first action. It also opens up a window of vulnerability between the time when the signal is detected and the handler is reinstalled during which if a second instance of the signal arrives, the default behaviour (usually terminate, sometimes with prejudice - aka core dump) occurs.
If two SIGINTs arrive in quick succession, the application will terminate with the default behavior. Is there any way to fix this? What other implications do these two issues have for a process that, for instance, wants to block SIGINT? Are there any other issues that I'm likely to run across while using signal? How do I fix them?
You really don't want to deal with signal()'s at all.
You want "events".
Ideally, you'll find a framework that's portable to all the main environments you wish to target - that would determine your choice of "event" implementation.
Here's an interesting thread that might help:
Game Objects Talking To Each Other
PS:
The main difference between signal() and sigaction() is that sigaction() is "signal()" on steroids - more options, allows SA_RESTART, etc. I'd discourage using either one unless you really, really need to.
I have a boost threadpool which I use to do certain tasks. I also have a Sensor class that has the pure virtual function doWork(int total) = 0;. Whenever it is requested, my main process gets the necessary Sensor pointer and tells the threadpool to run Sensor::doWork(int total).
threadpool->schedule(boost::bind(&Sensor::doWork,this,123456));
I am dynamically loading libraries of type Sensor, thus it is out of my control if someone else has faulty coding which results in SEGFAULTS and such. So is there a way for me to (in my main process) handle any errors thrown by Sensor::doWork(int total), clean up the thread, delete that sensor object and notify the console what and where the error has occurred?
Really the only way to handle a segmentation fault here is to run Sensor::doWork in a completely separate process.
In UNIX, this involves using fork (or some other similar means), running Sensor::doWork in the child process, and then somehow shuttling the results back to the parent process.
I assume similar means are available in Windows.
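On a POSIX system, the fork-based isolation could be sketched like this. The names run_isolated, TaskOutcome, healthy_task and faulty_task are hypothetical, and the sketch only reports how the child ended; passing real results back would need a pipe or similar, as noted above.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <functional>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// How the isolated task ended: crashed == true means it was killed by a signal,
// in which case code is the signal number; otherwise code is its exit status.
struct TaskOutcome {
    bool crashed;
    int code;
};

// Runs task in a child process so that a crash cannot take down the caller.
TaskOutcome run_isolated(const std::function<int()>& task) {
    const pid_t pid = fork();
    if (pid < 0) return {true, -1};  // fork itself failed
    if (pid == 0) {
        _exit(task());  // child: run the risky code, then exit without unwinding
    }
    int status = 0;
    while (waitpid(pid, &status, 0) < 0 && errno == EINTR) {
        // retry if a signal interrupted the wait
    }
    if (WIFSIGNALED(status)) return {true, WTERMSIG(status)};
    return {false, WEXITSTATUS(status)};
}

// Hypothetical tasks standing in for Sensor::doWork implementations.
int healthy_task() { return 7; }
int faulty_task() {
    std::signal(SIGSEGV, SIG_DFL);  // make sure the default action applies
    std::raise(SIGSEGV);            // simulate a segmentation fault
    return 0;
}
```

One caveat: calling fork from a process that already has several threads is delicate, because only the calling thread exists in the child. Starting the worker processes before the thread pool is created avoids that problem.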
EDIT: I thought I'd flesh out a bit some of the things you can do.
Solution #1: you can work with processes in the same fashion as you would threads. For example, you could create a process pool whose workers sit in a loop of
Wait for a task to be passed in over a pipe or queue or some similar object
Perform the task
Return the results over a pipe or queue or some similar object
And since you're executing the tasks in the other processes, you're protected against them crashing. The main difficulty with this solution is actually communicating between processes; maybe boost's interprocess library will help with that. I've mainly done this sort of thing in python, which has a standard multiprocessing module that handles this stuff for you.
Solution #2: You could divide your application into "safe" and "risky" portions that run in different processes. The "risky" portion executes the Sensor::doWork methods and anything else you might want to do in that process -- but only work that is acceptable to be spontaneously lost if it crashes. The "safe" portion deals with any precious information that you cannot afford to lose, and monitors the "risky" portion, performing some recovery operations when the child crashes. And, of course, whatever other work you decide you want to do in the safe part.
If you got a SIGSEGV, even if you caught it you have no guarantee about your program state so there's pretty much no way to recover.
If you're working with 3rd party libraries, and they're buggy, and the library maintainer won't fix it (and you don't have the source) then your only recourse is to run the third party library from within a totally separate binary that talks to the main binary by some means. See for example firefox and plugin-container.
You might want to register a callback function to catch SIGSEGV. In C this can be done using signal. Be aware, however, that there is not much you can do when the OS sends you a SIGSEGV (note that it isn't required to). You don't really know what state your program is in, I'd guess. If, for example, the heap got corrupted, new and delete operations may fail, so even a plain simple
std::cout << std::string("hello world") << std::endl;
statement might not work, since memory from the heap needs to be allocated.
Best, Christoph
Preface
I have a multi-threaded application running via Boost.Asio. There is only one boost::asio::io_service for the whole application, and everything is done inside it by a group of threads. Sometimes it is necessary to spawn child processes using fork and exec. When a child terminates I need to call waitpid on it to check the exit code and to collect the zombie. I used the recently added boost::asio::signal_set but encountered a problem on ancient systems with linux-2.4.* kernels (that are unfortunately still used by some customers). Under older Linux kernels threads are actually a special case of processes, and therefore if a child was spawned by one thread, another thread is unable to wait for it using the waitpid family of system calls. Asio's signal_set posts the signal handler to the io_service, and any thread running this service can run this handler, which is inappropriate for my case. So I decided to handle signals in the good old signal/sigaction way: all threads have the same handler that calls waitpid. So there is another problem:
The problem
When the signal is caught by the handler and the process is successfully waited on, how can I "post" this to my io_service from the handler? As it seems to me, the obvious io_service::post() method is impossible because it can deadlock on io_service internal mutexes if the signal comes at the wrong time. The only thing that came to my mind is to use some pipe or socketpair to write notifications there and async_wait on the other end, as is sometimes done to handle signals in poll() event loops.
Are there any better solutions?
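The pipe idea mentioned in the question (often called the self-pipe trick) can be sketched as below. All names here are hypothetical, and the handler only forwards the signal number; in the scenario described it would also call waitpid and could write the result instead. The read end is what an event loop would watch, for example through a boost::asio::posix::stream_descriptor, which is an assumption about the setup and is not shown.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <fcntl.h>
#include <unistd.h>

// Pipe shared between the signal handler (write end) and the event loop (read end).
static int g_notify_fds[2] = {-1, -1};

// Signal handler: only calls write(), which is async-signal-safe.
static void on_signal(int signo) {
    const int saved_errno = errno;  // do not disturb the interrupted code
    const unsigned char byte = static_cast<unsigned char>(signo);
    const ssize_t ignored = write(g_notify_fds[1], &byte, 1);
    (void)ignored;
    errno = saved_errno;
}

// Creates the pipe and installs the handler for signo.
bool install_self_pipe(int signo) {
    if (pipe(g_notify_fds) != 0) return false;
    // Non-blocking write end: the handler must never block on a full pipe.
    const int flags = fcntl(g_notify_fds[1], F_GETFL);
    if (flags < 0 || fcntl(g_notify_fds[1], F_SETFL, flags | O_NONBLOCK) < 0) {
        return false;
    }
    struct sigaction sa {};
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(signo, &sa, nullptr) == 0;
}

// Blocking read of one notification; returns the signal number or -1 on error.
int read_notification() {
    unsigned char byte = 0;
    ssize_t n;
    do {
        n = read(g_notify_fds[0], &byte, 1);
    } while (n < 0 && errno == EINTR);
    return n == 1 ? byte : -1;
}
```

Because the handler touches nothing but the pipe, it cannot deadlock on any mutex held by the interrupted thread; the real work happens later in normal context, when the loop reads the byte.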
I've not dealt with boost::asio but I have solved a similar problem. I believe my solution works for both LinuxThreads and the newer NPTL threads.
I'm assuming that the reason you want to "post" signals to your io_service is to interrupt a system call so the thread/program will exit cleanly. Is this correct? If not, maybe you can better describe your end goal.
I tried a lot of different solutions including some which required detecting which type of threads were being used. The thing that finally helped me solve this was the section titled Interruption of System Calls and Library Functions by Signal Handlers of man signal(7).
The key is to use sigaction() in your signal handling thread without SA_RESTART to create handlers for all the signals you want to catch, unmask these signals using pthread_sigmask(SIG_UNBLOCK, sig_set, 0) in the signal handling thread, and mask the same signal set in all other threads. The handler does not have to do anything. Just having a handler changes the behavior, and not setting SA_RESTART allows interruptible system calls (like write()) to be interrupted. Whereas if you use sigwait(), system calls in other threads are not interrupted.
To easily mask signals in all other threads, I start the signal handling thread, then mask all the signals I want to handle in the main thread before starting any other threads. When other threads are then started, they copy the main thread's signal mask.
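The two building blocks described above (a do-nothing handler installed without SA_RESTART, and per-thread masking with pthread_sigmask) might be sketched like this. The function names are hypothetical, and the sketch does not show the thread start-up order from the answer.

```cpp
#include <cassert>
#include <csignal>
#include <pthread.h>

// The handler does nothing; its mere presence changes how blocked calls behave.
static void noop_handler(int) {}

// Installs the no-op handler WITHOUT SA_RESTART, so that a blocking system call
// interrupted by signo fails with EINTR instead of being restarted.
bool install_interrupting_handler(int signo) {
    struct sigaction sa {};
    sa.sa_handler = noop_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;  // deliberately no SA_RESTART
    return sigaction(signo, &sa, nullptr) == 0;
}

// Blocks or unblocks signo for the calling thread only.
// Threads created afterwards inherit the creating thread's mask.
bool set_blocked(int signo, bool blocked) {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signo);
    return pthread_sigmask(blocked ? SIG_BLOCK : SIG_UNBLOCK, &set, nullptr) == 0;
}

// Reports whether signo is currently blocked in the calling thread.
bool is_blocked(int signo) {
    sigset_t current;
    sigemptyset(&current);
    pthread_sigmask(SIG_SETMASK, nullptr, &current);  // null set: query only
    return sigismember(&current, signo) == 1;
}

// Reports whether the installed action for signo has SA_RESTART set.
bool restarts_syscalls(int signo) {
    struct sigaction current {};
    sigaction(signo, nullptr, &current);
    return (current.sa_flags & SA_RESTART) != 0;
}
```

A thread that then sees read() or write() fail with errno set to EINTR can treat that as its cue to check whether a shutdown was requested.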
The point is that if you do this, you may not need to post signals to your io_service, because you can just check your system calls for interrupt return codes. I don't know how this works with boost::asio though.
So the end result of all this is that I can catch the signals I want, like SIGINT, SIGTERM, SIGHUP and SIGQUIT, in order to perform a clean shutdown, but my other threads still get their system calls interrupted and can also exit cleanly, without any communication between the signal thread and the rest of the system and without doing anything dangerous in the signal handler, and a single implementation works on both LinuxThreads and NPTL.
Maybe that wasn't the answer you were looking for but I hope it helps.
NOTE: If you want to figure out if the system is running LinuxThreads, you can do this by spawning a thread and then comparing its PID to the main thread's PID. If they differ, it's LinuxThreads. You can then choose the best solution for the thread type.
If you are already polling your IO, another possible solution that is very simple is to just use a boolean to signal the other threads. A boolean is always either zero or not, so there is no possibility of a partial update and a race condition. You can then just set this boolean flag, which the other threads read, without any mutexes. Tools like valgrind won't like it, but in practice it works.
If you want to be even more correct you can use gcc's atomics but this is compiler specific.
I have an application that allows users to write their own code in a language of our own making that's somewhat like C++. We're getting problems, however, where sometimes our users will accidentally write an infinite loop into their script. Once the script gets into the infinite loop, the only way they can get out is to shut the application down and restart, potentially losing their work. I'd like to add some means where the user, when he realizes that his code is in an infinite loop, can hit a special key, like F10 or something, and the code will break out of the loop. But I'd like to do it without implementing a ton of checks within the script runtime. Optimally, I'd like to have a separate "debugger" thread that's mostly idle, but as one of its tasks it listens for that F10 key, and when it gets the F10 key, it will cause the script runtime thread to throw an exception, so that it will stop executing the script. So my question is, is there a way to have one thread cause another thread to throw an exception? My application is written in C++.
If the script is actually interpreted by your application then you can just tell the interpreter to stop executing whenever some user event occurs.
It's possible. Detect the keystroke in a separate thread, a hidden window and WM_HOTKEY for example. Call SuspendThread() to freeze the interpreter thread. Now use GetThreadContext() to get the CPU registers of the interpreter thread. Modify CONTEXT.Eip to the address of a function and call SetThreadContext(). Have that function call RaiseException() or throw a C++ exception. ResumeThread() and boom.
A short answer - no.
If your application runs on Windows, maybe you can send a message from this "debugger" thread and have a message loop in the main one?
The problem with that solution is, to do a message sending implementation, I'd have to set up a "listener" as part of the script interpreter. Right now, the interpreter just executes the function. The message loop is implemented outside of the interpreter. If within the function there is an infinite loop, then to break out of that script, I'd have to check for a message in between execution of each instruction in the interpreter, i.e. while(more instructions){check F10, execute script instruction}. That seems like a lot of extra unneeded checks that can slow down the script execution. But if that's the only solution, then I guess that's what it has to be. I still think there's got to be a better way. Maybe the script interpreter needs to be run on a child thread, while the main thread continues its message loop, and will then kill the script interpreter thread when it gets an F10.
Whether you code it explicitly or not, you will need to check an "interrupt" variable in the message loop. If you implement this as a simple volatile int, you will have both a very simple test and very little overhead.
It is unsafe to terminate a thread, as it is probably using resources shared across the entire process.
It is less unsafe to terminate an entire process, but that's not going to help you.
A more safe way to deal with this would be to have the interpreter check for events on a regular basis and treat the stop event as a case to terminate (or at least spill out to a higher loop).
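A sketch of that regular check, assuming a hypothetical interpreter loop (run_script) in which incrementing a counter stands in for executing one script instruction. It uses std::atomic<bool> for the stop flag and only polls it once every check_interval instructions to keep the overhead small; both choices are assumptions, not something the application is known to do.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Thrown inside the interpreter thread when the user asks to stop the script.
struct ScriptInterrupted : std::runtime_error {
    ScriptInterrupted() : std::runtime_error("script interrupted by user") {}
};

// Set by the "debugger" thread (for example on F10), read by the interpreter thread.
std::atomic<bool> g_stop_requested{false};

// Hypothetical interpreter loop. check_interval must be non-zero.
// Returns the number of instructions executed if the script ends normally.
std::size_t run_script(std::size_t max_instructions, std::size_t check_interval) {
    std::size_t executed = 0;
    while (executed < max_instructions) {
        // Poll the flag only once per check_interval instructions.
        if (executed % check_interval == 0 &&
            g_stop_requested.load(std::memory_order_relaxed)) {
            throw ScriptInterrupted();
        }
        ++executed;  // stand-in for executing one script instruction
    }
    return executed;
}

// Hypothetical demo: request a stop, run a long script, report whether it was interrupted.
bool demo_interrupt() {
    g_stop_requested.store(true);
    bool interrupted = false;
    try {
        run_script(1000000, 1024);
    } catch (const ScriptInterrupted&) {
        interrupted = true;
    }
    g_stop_requested.store(false);  // reset for the next script
    return interrupted;
}
```

The exception is thrown by the interpreter thread itself, at a point where its own state is consistent, which is what makes this safer than forcing an exception into the thread from outside.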
For Windows, you could also queue an APC to that thread that calls RaiseException(...) or throws an exception (although I would avoid the latter, since that crosses API boundaries), but that also requires that the thread put itself into an alertable state. And I don't really recommend it.