C++ printing in signal handlers?

I searched a lot but found nothing that answers my question. I read that it's not safe to use cout in signal handlers like this:
void ctrlZHandler(int sig_num) {
//SIGTSTP-18
std::cout << "smash: got ctrl-Z" << std::endl;
SmallShell::route_signal(sig_num);
}
Will it solve the problem if I move the printing inside route_signal?
Is there a list of safe-to-call functions in C++11?
What if the only solution is to use write? Can you show me a short example? And let's say route_signal has 100 print statements: should I replace them all with write()? That sounds exhausting, with the need to allocate and free memory...

The reason using std::cout inside signal handlers isn't recommended is that a signal can interrupt your running code at any point, and the stream's operator<< is not reentrant.
This means that if a signal arrives while your code is executing operator<< on std::cout, and the handler also uses it, the behavior is undefined.
So, no. Moving it into route_signal would not solve this; you should replace every use of std::cout inside it!
One workaround is to set a flag in the handler recording that the signal was received, and to produce the output outside the signal handler after it has returned.
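A minimal sketch of that flag workaround, assuming a POSIX system (for SIGTSTP). The names g_got_sigtstp and report_pending_signal are made up for illustration and are not from the question:

```cpp
#include <csignal>
#include <iostream>

// Hypothetical flag: the only thing the handler touches.
volatile std::sig_atomic_t g_got_sigtstp = 0;

// The handler only records that the signal arrived; it prints nothing.
extern "C" void ctrlZHandler(int /*sig_num*/) {
    g_got_sigtstp = 1;
}

// Hypothetical helper, called from ordinary code (for example once per
// iteration of the shell's main loop). Prints the message if a signal was
// recorded since the last call and returns whether it did so.
bool report_pending_signal(std::ostream& out) {
    if (!g_got_sigtstp) {
        return false;
    }
    g_got_sigtstp = 0;
    out << "smash: got ctrl-Z" << std::endl;
    return true;
}
```

The handler would be registered as usual, e.g. std::signal(SIGTSTP, ctrlZHandler), and the main loop would call report_pending_signal(std::cout) at a point where printing is safe.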

Signal handlers need to run quickly and be reentrant, which is why they shouldn’t call output stream functions like cout <<, either directly or indirectly.
If you are doing this temporarily under controlled conditions for testing, it might be okay, but make sure the signal you are handling is not triggered again until the handler has finished and be aware that stream functions can be slow, which might mess up your tests as well.

Will it solve the problem if I move the printing inside route_signal?
No.
Is there a list of safe-to-call functions in C++11?
For practical purposes, the only safe thing you can do is set a volatile sig_atomic_t or lock-free atomic flag inside a signal handler. (N3690 intro.execution §1.9 ¶6)
I'm neither a C nor a C++ language lawyer, but I believe anything permitted in a conforming C application is allowed in a C++11 signal handler. However, that set is very, very limited: abort, quick_exit, _Exit, and signal. (ISO/IEC 9899:2011 §7.14.1.1 ¶5).
What if the only solution is to use write? Can you show me a short example? And let's say route_signal has 100 print statements: should I replace them all with write()? That sounds exhausting, with the need to allocate and free memory...
A better solution is to redesign your program to use sigwait, or to check a flag that is safely set inside the signal handler.
If you insist on using write, and if you trust that it is safe to call inside a signal handler in your C++ implementation — which it probably is but, again, is not guaranteed by C++ itself — then you simply have a coding problem. You'll need to figure out formatting yourself, bearing in mind that even on POSIX-conforming systems malloc and free are not async-signal-safe. It can certainly be done.
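For what the write route might look like, here is a rough sketch assuming a POSIX system, where write is on the async-signal-safe list. The helpers format_unsigned and safe_print are hypothetical names. They format into a stack buffer, so no malloc, free, printf, or stream call is involved:

```cpp
#include <cstddef>
#include <unistd.h>   // write, STDOUT_FILENO, ssize_t (POSIX)

// Hypothetical helper: writes the decimal digits of value into buf
// (at most cap characters, no terminating NUL). Returns the number of
// characters stored. If cap is too small, only the leading digits are kept.
std::size_t format_unsigned(unsigned long value, char* buf, std::size_t cap) {
    char tmp[32];
    std::size_t n = 0;
    do {
        tmp[n++] = static_cast<char>('0' + value % 10);  // least significant first
        value /= 10;
    } while (value != 0 && n < sizeof tmp);
    std::size_t len = (n < cap) ? n : cap;
    for (std::size_t i = 0; i < len; ++i) {
        buf[i] = tmp[n - 1 - i];                         // reverse into buf
    }
    return len;
}

// Hypothetical helper: emits "<msg><value>\n" with a single write() call.
// Long messages are truncated to fit the fixed buffer.
ssize_t safe_print(int fd, const char* msg, unsigned long value) {
    char buf[128];
    std::size_t len = 0;
    while (msg[len] != '\0' && len < sizeof buf - 22) {
        buf[len] = msg[len];
        ++len;
    }
    len += format_unsigned(value, buf + len, 20);
    buf[len++] = '\n';
    return write(fd, buf, len);
}

// Handler using only the helpers above.
extern "C" void ctrlZHandler(int sig_num) {
    safe_print(STDOUT_FILENO, "smash: got signal ", static_cast<unsigned long>(sig_num));
}
```

With 100 print statements, a couple of small helpers like these would keep each call site to one line, but every one of them still has to avoid anything that allocates or locks.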

Related

Is read-only access to a vector (vector::operator[] and vector::size()) asynchronous-safe?

My program needs to perform read-only access to the contents of a vector<string> in a signal handler for SIGINT. (The alternative is to use a fixed-size array of fixed-length C strings.) The program is designed to run in a POSIX environment.
Are vector::operator[] and vector::size() asynchronous-safe (or signal-safe)?

No, it's not safe. C++11 1.9/6:
When the processing of the abstract machine is interrupted by receipt of a signal, the values of objects which are neither of type volatile std::sig_atomic_t nor lock-free atomic objects (29.4) are unspecified during the execution of the signal handler, and the value of any object not in either of these two categories that is modified by the handler becomes undefined.

Angew's answer is correct considering C++. Now that the question mentions POSIX environment, which could provide stronger guarantees, this needs another answer, which is:
If the process is multi-threaded, or if the process is single-threaded and a signal handler is executed other than as the result of:
The process calling abort(), raise(), kill(), pthread_kill(), or sigqueue() to generate a signal that is not blocked
A pending signal being unblocked and being delivered before the call that unblocked it returns
the behavior is undefined if the signal handler refers to any object other than errno with static storage duration other than by assigning a value to an object declared as volatile sig_atomic_t, or if the signal handler calls any function defined in this standard other than one of the functions listed in the following table.
Source: The Open Group Base Specifications Issue 7
IEEE Std 1003.1, 2013 Edition, 2.4.3
This is... still a very weak guarantee. As far as I can understand this:
vector::operator[] is not safe. Fixed arrays with static storage duration are not safe. Access to fixed arrays is safe if the array is non-static.
Why? The standard doesn't specify exactly how vector::operator[] should be implemented, only its preconditions and postconditions. Access to the elements of an array is possible (if the array is non-static); this implies that access to vector elements is also safe if you obtain a pointer (with vec.data() or &vec[0]) before the signal arrives and then access the elements through that pointer.
EDIT: Originally I missed this because I wasn't aware of the sigaction function. With signal you could only access your local arrays in the signal handler, but with sigaction you can provide pointers to automatic and dynamically allocated arrays. The advice to do as little as possible in signal handlers still applies here, though.
Bottom line: You're doing too much in your signal handlers. Try doing as little as possible. One approach is to assign to a flag (of type volatile sig_atomic_t), and return. The code can later check if the flag was triggered (e.g. in an event loop).

I believe that if you know the reason that access to a vector is not safe then you can work around it. Note that access still isn't guaranteed safe. But it will work on anything that isn't a Death Station 9000.
A signal handler interrupts the execution of the program much like an interrupt handler would if you were programming directly to the hardware. The operating system simply stops executing your program, wherever it happens to be. This might be in the middle of anything. For example, if your vector has elements being added to it and it is updating its size value or copying its contents to a new, longer buffer, that might be interrupted by the signal. Your signal handler would then try to read from the vector, resulting in disaster.
You can access the vector from the signal handler as long as it is effectively constant. If you set up the whole thing at program start and then never write to it again, it is safe to use. Note, not safe to use according to the standards documents, but effectively safe.
This is a lot like multi-threading on a single-core CPU.
Now, if you do need to update the vector while the program is running you need to "lock" the signal handler away from it by masking the signal or disabling the handler before updating the vector, to ensure that the handler won't run while the vector is in an inconsistent state.
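One way to "lock" the handler out might look like the sketch below. It assumes POSIX and a single-threaded program (with threads, pthread_sigmask would be used instead of sigprocmask). SignalBlocker, g_names, g_seen_size, and add_name are hypothetical names, and reading the vector in the handler is still only "effectively safe" in the sense described above, not guaranteed by the standard:

```cpp
#include <signal.h>   // sigprocmask, sigset_t, sig_atomic_t (POSIX)
#include <string>
#include <vector>

// Hypothetical RAII guard: blocks one signal while the object is alive and
// restores the previous mask afterwards.
class SignalBlocker {
public:
    explicit SignalBlocker(int signum) {
        sigset_t set;
        sigemptyset(&set);
        sigaddset(&set, signum);
        sigprocmask(SIG_BLOCK, &set, &old_mask_);
    }
    ~SignalBlocker() { sigprocmask(SIG_SETMASK, &old_mask_, nullptr); }
    SignalBlocker(const SignalBlocker&) = delete;
    SignalBlocker& operator=(const SignalBlocker&) = delete;
private:
    sigset_t old_mask_;
};

std::vector<std::string> g_names;        // read by the SIGINT handler
volatile sig_atomic_t g_seen_size = -1;  // what the handler last observed

// Handler that reads the vector; it relies on never running mid-update.
extern "C" void on_sigint(int) {
    g_seen_size = static_cast<sig_atomic_t>(g_names.size());
}

// Every modification goes through a function that blocks SIGINT first.
void add_name(const std::string& name) {
    SignalBlocker guard(SIGINT);  // a SIGINT arriving now stays pending
    g_names.push_back(name);
}                                 // mask restored; a pending SIGINT is delivered here
```

A signal that arrives while the guard is alive is not lost; it is delivered once the mask is restored, when the vector is consistent again.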

Signal handler and local state

I'm working in C++ on Unix.
Say I have a long running function that does something, for example read stuff from file and parse it. In this function I keep count of the things I read from the file in a local variable num_read.
I want to catch CTRL+c in a custom signal handler and print the value of num_read.
The only way I can think of is allocating num_read on the heap and storing its address in a global variable that can be accessed by my signal handler. Is there a more elegant way?

The answer is no. There is no way of communicating between
a signal handler and the rest of code except by global
variables.
Also, you can only do a very, very limited number of things in
a signal handler. You cannot use a << on an std::ostream,
for example, nor can you call printf. The usual way of
handling signals under Unix is to catch them in a separate
thread. The alternative (which works for other OS's as well) is
to define a global variable of sig_atomic_t, which is set in
the signal handler, and polled in the main loop. (In your case,
for example, you might poll it every time you update
num_read.)
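A sketch of the separate-thread approach, assuming POSIX threads. The counter is shared as a std::atomic rather than a local variable, and start_signal_thread is a hypothetical name. The signal is blocked before the thread is created so that the new thread inherits the mask and the signal can only be consumed by sigwait:

```cpp
#include <atomic>
#include <cstdio>
#include <pthread.h>
#include <signal.h>   // sigwait, pthread_sigmask (POSIX)
#include <thread>

std::atomic<long> num_read{0};   // updated by the long-running function

// Hypothetical helper: blocks signum in the calling thread and starts a
// thread that waits for it. The value it saw is stored in reported so the
// caller can inspect it after joining.
std::thread start_signal_thread(int signum, std::atomic<long>& reported) {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signum);
    pthread_sigmask(SIG_BLOCK, &set, nullptr);   // inherited by new threads
    return std::thread([set, signum, &reported] {
        int received = 0;
        if (sigwait(&set, &received) == 0 && received == signum) {
            // This is ordinary thread code, not a signal handler, so
            // printf and streams may be used here.
            reported = num_read.load();
            std::printf("num_read = %ld\n", reported.load());
        }
    });
}
```

In a real program this would be called at the very start of main with SIGINT, before any other thread exists, and the parsing loop would simply increment num_read.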

Besides the traditional Unix approach with signal handlers, there are other options:
Since Linux kernel 2.6.22 there is the signalfd() function. You obtain an ordinary file descriptor and poll it (using select or epoll) for incoming signals. When you handle a signal this way, none of the usual signal-handler restrictions apply: it's just ordinary userspace code, so you can call whatever you want.
As far as I know, OS X has a similar feature in kqueue (search this site or the internet for EVFILT_SIGNAL and kqueue).
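A minimal, Linux-only sketch of the signalfd route. The helper names open_signal_fd and read_signal are made up. A real program would hand the descriptor to select or epoll rather than calling a blocking read directly:

```cpp
#include <signal.h>
#include <sys/signalfd.h>  // signalfd, signalfd_siginfo (Linux-specific)
#include <unistd.h>        // read, close

// Hypothetical helper: blocks signum (so it is not delivered the normal way)
// and returns a descriptor that becomes readable when it arrives.
// Returns -1 on failure.
int open_signal_fd(int signum) {
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, signum);
    if (sigprocmask(SIG_BLOCK, &mask, nullptr) == -1) {
        return -1;
    }
    return signalfd(-1, &mask, SFD_CLOEXEC);
}

// Hypothetical helper: reads one pending signal from the descriptor and
// returns its number, or -1 on error. Blocks until a signal is available.
int read_signal(int fd) {
    signalfd_siginfo info;
    ssize_t n = read(fd, &info, sizeof info);
    if (n != static_cast<ssize_t>(sizeof info)) {
        return -1;
    }
    return static_cast<int>(info.ssi_signo);
}
```

For the num_read case, the main loop could check the descriptor between chunks of work and print the counter with ordinary stream output when read_signal returns SIGINT.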

cancel a c++ 11 async task

How can I stop/cancel an asynchronous task created with std::async and policy std::launch::async? In other words, I have started a task running on another thread, using future object. Is there a way to cancel or stop the running task?

In short, no.
Longer explanation: There is no safe way to cancel any threads in standard C++. This would require thread cancellation. This feature was discussed many times during the C++11 standardisation and the general consensus is that there is no safe way to do so. To my knowledge, three main approaches to thread cancellation in C++ were considered.
1. Abort the thread. This would be rather like an emergency stop. Unfortunately it would result in no stack unwinding and no destructors being called. The thread could have been in any state, so it might be holding mutexes, have heap-allocated data that would be leaked, etc. This was clearly never going to be considered for long since it would make the entire program undefined. If you want to do this yourself, however, just use native_handle to do it. It will, however, be non-portable.
2. Compulsory cancellation/interruption points. When a thread cancel is requested, it internally sets some variable so that the next time any of a predefined set of interruption points is called (such as sleep, wait, etc.) it will throw some exception. This would cause the stack to unwind so that cleanup can be done. Unfortunately this type of system makes it very difficult to make any code exception safe, since most multithreaded code can then suddenly throw. This is the model that Boost.Thread uses. It uses disable_interruption to work around some of the problems, but it is still exceedingly difficult to get right for anything but the simplest of cases. It has always been considered risky and, understandably, it was not accepted into the standard along with the rest.
3. Voluntary cancellation/interruption points. Ultimately this boils down to checking some condition yourself when you want to and, if appropriate, exiting the thread yourself in a controlled fashion. I vaguely recall some talk about adding some library features to help with this, but it was never agreed upon.
I would just use a variation of 3. If you are using lambdas for instance it would be quite easy to reference an atomic "cancel" variable which you can check from time to time.
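A small sketch of the voluntary approach with a lambda and an atomic flag. The names run_until_cancelled and start_and_cancel are invented for illustration, and the 1 ms sleep merely stands in for a unit of real work:

```cpp
#include <atomic>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical worker: loops until cancel is set, then returns how many
// iterations it completed.
long run_until_cancelled(const std::atomic<bool>& cancel) {
    long iterations = 0;
    while (!cancel.load()) {   // the voluntary cancellation point
        ++iterations;
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    return iterations;
}

// Starts the worker with std::async, lets it run for run_for, then asks it
// to stop and collects its result.
long start_and_cancel(std::chrono::milliseconds run_for) {
    std::atomic<bool> cancel{false};
    std::future<long> result = std::async(std::launch::async, [&cancel] {
        return run_until_cancelled(cancel);
    });
    std::this_thread::sleep_for(run_for);
    cancel = true;             // request cancellation
    return result.get();       // waits until the loop has noticed the flag
}
```

The task only stops at the points where it checks the flag, so the check should sit wherever the work can be abandoned cleanly.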

In C++11 (I think) there is no standard way to cancel a thread. If you get std::thread::native_handle(), you can do something with it but that's not portable.

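Maybe you can do it this way, by checking some condition:
#include <atomic>
#include <functional>
#include <future>
#include <vector>

class Timer {
public:
    Timer() : timer_destroy(false) {}
    ~Timer() {
        // Ask every task to stop, then wait for each one to finish.
        timer_destroy = true;
        for (auto& result : async_result) {  // futures are move-only: iterate by reference
            result.get();
        }
    }
    void register_event() {
        async_result.push_back(
            std::async(std::launch::async, [](std::atomic<bool>& timer_destroy) {
                while (!timer_destroy) {
                    // do something
                }
            }, std::ref(timer_destroy))
        );
    }
private:
    std::vector<std::future<void>> async_result;  // the tasks return nothing
    std::atomic<bool> timer_destroy;
};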

signal vs thread

I was looking for some info about reentrancy and came across signals and threads. What is the difference between the two?
Please advise.
Many thanks.

You are comparing apples and oranges. Signal Programming is event driven programming and can be used to influence threads. However the signal programming paradigm can be used in a single threaded application.

To understand signals it is best to start by thinking about a single threaded program. This program is doing whatever it does with its one thread and then a signal is delivered to it. If the program has registered a signal handler (a function to call) for that signal, then the normal execution of the program is put on hold for a little while and the registered handler function is run (very much like a hardware interrupt interrupts the operating system to run interrupt service routines). So with the code:
#include <stdio.h>
#include <signal.h>
#include <unistd.h> // for alarm

volatile int x = 0;

void foo(int sig_num) {
    x = sig_num;
}

int main(void) {
    unsigned long long count = 0;
    signal(SIGALRM, foo);
    alarm(1); // This is a posix function and may not be in all hosted
              // C implementations.
              // SIGALRM will be sent to this process in 1 second.
    while (!x) {
        printf("not x\n");
        count++;
    }
    printf("x is %i and count = %llu\n", x, count);
}
The program will loop until someone sends it a signal (how this happens may differ by platform). If the signal SIGALRM is sent then foo will set x and the loop will exit. Exactly where in the loop foo is called is not clear. It could happen just between the print and incrementing the count, just after the while conditional is tested, during the print, ... lots of places, really. This is why signals may pose a concurrency or reentrancy problem -- they can change things without the other code knowing that it happened.
The reason that x was declared as volatile was that without that many compilers might think "hey, no one in main changes x and main doesn't call any other functions, so x never changes" and optimize out the loop test. Specifying volatile tells the C compiler that this variable can be changed by unseen forces (such as signal handlers, other threads, or sometimes even hardware in the case of memory mapped device control registers).
It was pretty easy to make sure that x was looked after properly between the signal handler and the main execution code because x is just an integer (load and store for it were probably single assembly instructions each), it was only altered by one thing (the signal handler, rather than the main code) in this case, and it was only used as a simple boolean value. If x were some other type, such as a string, then since signals can interrupt you at any time, the signal handler might overwrite part of the string while the main code was in the middle of reading it. This could have results as bad as someone freezing time while you were in the middle of brushing your teeth, replacing your toothbrush with a cobra, and then unfreezing time.
A little bit more on signals -- they are part of the C language, but most of their use is not really covered by C. Many of the Linux, Unix, and POSIX functions that have to do with signals are not part of the C language, but it is difficult to come up with reasonable (and small) examples of signal use that don't rely on something outside the C standard, which is why I used the alarm function. The raise function, which is part of C, can be used to send a signal to yourself, but it is more difficult to make examples for.
As scary as signals may seem now, most systems have more functions that make them much easier to use.
threads, finally
Threads execute concurrently, while signals interrupt. While there are some threading libraries that actually implement threading in such a way that this is not really the case, it is best to think of threads this way. Since computer programs are actually very limited in their ability to see what is going on, threads can get in each other's way just like signal handlers can get in the way of the main execution code (probably more often than signal handlers, though).
Imagine that you are about to brush your teeth again, but this time you are deaf and blind. Now your roommate, who is also deaf and blind, comes in to fix the sink with some silicone sealer. Just as you reach for the toothpaste he lays down the tube of silicone right on top of the tube of toothpaste and you grab the tube of silicone instead of the toothpaste. Remember, since you are both blind and deaf (and somehow not bumping into each other) you both assume that no one else is using the sink, so you never realize that you have just put the silicone on your toothbrush, and your roommate doesn't realize that he is trying to fill the cracks between the tile and the back of the sink with toothpaste.
Luckily there are ways that threads can communicate to each other that something is currently in use so other threads should stay away (like locking the door while you brush your teeth).

A thread lives inside a process, whereas signals are part of a universe; signals can be delivered to processes or to a specific thread inside a process.

Basic signal handling in C++

This is a pretty basic scenario but I'm not finding too many helpful resources. I have a C++ program running in Linux that does file processing. It reads lines, does various transformations, and writes data into a database. There are certain variables (stored in the database) that affect the processing, which I'm currently reading at every iteration because I want processing to be as up to date as possible, but a slight lag is OK. But those variables change pretty rarely, and the reads are expensive over time (10 million plus rows a day). I could space out the reads to every n iterations or simply restart the program when a variable changes, but those seem hackish.
What I would like to do instead is have the program trigger a reread of the variables when it receives a SIGHUP. Everything I'm reading about signal handling is talking about the C signal library which I'm not sure how to tie in to my program's classes. The Boost signal libraries seem to be more about inter-object communication rather than handling OS signals.
Can anybody help? It seems like this should be incredibly simple, but I'm pretty rusty with C++.

I would handle it just like you might handle it in C. I think it's perfectly fine to have a stand-alone signal handler function, since you'll just be posting to a semaphore or setting a variable or some such, which another thread or object can inspect to determine if it needs to re-read the settings.
#include <signal.h>
#include <stdio.h>

/* or you might use a semaphore to notify a waiting thread */
static volatile sig_atomic_t sig_caught = 0;

void handle_sighup(int signum)
{
    /* in case we registered this handler for multiple signals */
    if (signum == SIGHUP) {
        sig_caught = 1;
    }
}

int main(int argc, char* argv[])
{
    /* you may also prefer sigaction() instead of signal() */
    signal(SIGHUP, handle_sighup);
    while(1) {
        if (sig_caught) {
            sig_caught = 0;
            printf("caught a SIGHUP. I should re-read settings.\n");
        }
    }
    return 0;
}
You can test sending a SIGHUP by using kill -1 `pidof yourapp`.

I'd recommend checking out this link which gives the details on registering a signal.
Unless I'm mistaken, one important thing to remember is that any non-static member function expects an implicit object parameter (this), which means non-static member functions can't be signal handlers. I believe you'll need to register either a static member function or some kind of global function. From there, if you have a specific member function you want to take care of your update, you'll need a way to reference that object.
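A sketch of the static-member-function idea, assuming a POSIX system for SIGHUP. The class Settings and its members are hypothetical. The handler only sets a static flag, and an ordinary member function later checks it, which avoids having to reach a particular object from inside the handler:

```cpp
#include <csignal>

// Hypothetical class; all names here are illustrative only.
class Settings {
public:
    // Registers the static member function as the handler for signum.
    static void install(int signum) {
        std::signal(signum, &Settings::on_signal);
    }

    // Called from ordinary code (e.g. once per processed line). Re-reads the
    // settings if a signal arrived since the last call; returns whether it did.
    bool reload_if_requested() {
        if (!reload_requested_) {
            return false;
        }
        reload_requested_ = 0;
        ++reload_count_;   // stand-in for actually re-reading the database
        return true;
    }

    int reload_count() const { return reload_count_; }

private:
    // A static member function has no implicit this parameter, so it can be
    // used where a plain function pointer is expected.
    static void on_signal(int) { reload_requested_ = 1; }

    static volatile std::sig_atomic_t reload_requested_;
    int reload_count_ = 0;
};

volatile std::sig_atomic_t Settings::reload_requested_ = 0;
```

Usage would be Settings::install(SIGHUP) at startup and a call to reload_if_requested() in the processing loop.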

There are several possibilities; it would not necessarily be overkill to implement all of them:
Respond to a specific signal, just like C does. C++ works the same way. See the documentation for signal().
Trigger on the modification timestamp of some file changing, like the database if it is stored in a flat file.
Trigger once per hour, or once per day (whatever makes sense).

You can define a Boost signal corresponding to the OS signal and tie the Boost signal to your slot to invoke the respective handler.
