Catching an exception thrown from a callback in cudaLaunchHostFunc

I want to check for an error flag living in managed memory that might have been written by a kernel running on a certain stream. Depending on the error flag I need to throw an exception.
I would simply sync this stream and check the flag from the host, but I need to do so from inside a CUDA graph. AFAIK I need to somehow encode this host-side error checking inside a cudaLaunchHostFunc callback.
I am trying to understand how the cudaLaunchHostFunc function deals with exceptions. The documentation does not mention anything about it.
Is there any way to catch an exception thrown from inside the function provided to cudaLaunchHostFunc?
Consider the following MWE:
#include <cstdio>
#include <iostream>
#include <stdexcept>
__global__ void kern(){
int id = blockIdx.x*blockDim.x + threadIdx.x;
printf("Kernel\n");
return;
}
void foo(void* data){
std::cerr<<"Callback"<<std::endl;
throw std::runtime_error("Error in callback");
}
void launch(){
cudaStream_t st = 0;
kern<<<1,1,0,st>>>();
cudaHostFn_t fn = foo;
cudaLaunchHostFunc(st, fn, nullptr);
cudaDeviceSynchronize();
}
int main(){
try{
launch();
}
catch(...){
std::cerr<<"Catched exception"<<std::endl;
}
return 0;
}
The output of this code is:
Kernel
Callback
terminate called after throwing an instance of 'std::runtime_error'
what(): Error in callback
Aborted (core dumped)
The exception is thrown but it appears that it is not propagated to the launch function. I would have expected the above launch() function to be equivalent (exception-wise) to the following:
void launch(){
cudaStream_t st = 0;
kern<<<1,1,0,st>>>();
cudaStreamSynchronize(st);
foo(nullptr);
// cudaHostFn_t fn = foo;
// cudaLaunchHostFunc(st, fn, nullptr);
cudaDeviceSynchronize();
}
which does output the expected:
Kernel
Callback
Catched exception
Additionally, in the first case, all cuda calls return cudaSuccess.

Thanks to the comments I understand now that my question is essentially the same as, for instance, this one: How can I propagate exceptions between threads?
The techniques used to take exceptions from a worker thread to the main thread also apply here.
For completeness, the foo and launch functions in my dummy example could be rewritten as follows:
void foo(void* data){
auto e = static_cast<std::exception_ptr*>(data);
std::cerr<<"Callback"<<std::endl;
try{
throw std::runtime_error("Error in callback");
}
catch(...){
*e = std::current_exception();
}
}
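void launch(){
cudaStream_t st = 0;
kern<<<1,1,0,st>>>();
cudaStreamSynchronize(st);
cudaHostFn_t fn = foo;
// e must stay alive until the callback has run; the sync below ensures that
std::exception_ptr e;
cudaLaunchHostFunc(st, fn, (void*)&e);
cudaDeviceSynchronize();
// back on the launching thread: rethrow whatever the callback stored
if(e) std::rethrow_exception(e);
}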
Which prints the expected:
Kernel
Callback
Catched exception
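The same idea can be packaged so that any callable is wrapped in a callback of the void(void*) shape. The following is only a sketch under stated assumptions: GuardedCall, guarded_trampoline, check_error_flag and run_on_other_thread_and_rethrow are hypothetical names, and a std::thread stands in for the thread on which the CUDA runtime would invoke the host function, so nothing here calls the CUDA API.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <thread>

// Hypothetical helper: the work to run plus a slot for any exception it throws.
struct GuardedCall {
    std::function<void()> work;
    std::exception_ptr error;  // stays null if work() returns normally
};

// Has the void(void*) shape of a host-function callback.
// It never lets an exception escape into the code that invokes it.
void guarded_trampoline(void* data) noexcept {
    auto* call = static_cast<GuardedCall*>(data);
    try {
        call->work();
    } catch (...) {
        call->error = std::current_exception();
    }
}

// Hypothetical host-side check: throws if the flag reports a failure.
void check_error_flag(int flag) {
    if (flag != 0) throw std::runtime_error("Error in callback");
}

// Stand-in for "enqueue the callback, then synchronize": runs the trampoline
// on another thread, waits for it, and rethrows on the caller's thread.
void run_on_other_thread_and_rethrow(GuardedCall& call) {
    std::thread runner([&call] { guarded_trampoline(&call); });
    runner.join();  // plays the role of the stream synchronization
    if (call.error) std::rethrow_exception(call.error);
}
```

With CUDA, the address of a GuardedCall would be passed as the user data of cudaLaunchHostFunc, and the rethrow would happen after the stream has been synchronized. The GuardedCall object has to outlive the callback.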

Related

C++ uncaught exception in worker thread

An uncaught exception behaves differently in the main thread and in another std::thread.
Here is the test program:
#include <cstdio>
#include <stdexcept>
#include <thread>
class XXX{
public:
XXX(){std::fprintf(stderr, "XXX ctor\n");}
~XXX(){std::fprintf(stderr, "XXX dtor\n");}
};
void mytest(int i)
{
XXX xtemp;
throw std::runtime_error("Hello, world!");
}
int main(int argc, char *argv[])
{
if(argc == 1) {
mytest(0);
}else{
std::thread th([&]{mytest(0);});
th.join();
}
}
The code above (C++11) was compiled with GCC 5.4.
Run without arguments:
XXX ctor
terminate called after throwing an instance of 'std::runtime_error'
what(): Hello, world!
Aborted (core dumped)
Run with one argument:
XXX ctor
XXX dtor
terminate called after throwing an instance of 'std::runtime_error'
what(): Hello, world!
Aborted (core dumped)
So the stack is unwound in the worker thread but not in the main thread. Why?
I'm asking because I'd like the core dump to give a useful stack backtrace in both cases (for an uncaught exception).
Thanks in advance!!!
Further trials reveal that adding the noexcept keyword to the thread function mytest() partially solves my problem, because the unwind then fails. It is not a good solution, though, because unwinding still partially happens if mytest() calls another function that lacks the noexcept guarantee and that function actually throws the uncaught exception.
Update: Thanks to everyone who commented. I now understand that C++ exceptions are not backtrace friendly, and that GCC, as a C++ implementation, is free to skip unwinding when an uncaught exception is thrown from the main thread and to unwind when it is thrown from a worker thread.
Update: Special thanks to Sid S & Jive Dadson. I must have mixed up some concepts: 1) exception/error handling; 2) runtime assertions; 3) segmentation faults. 2 and 3 are similar: they are unrecoverable errors, so aborting immediately is the only choice, and they are backtrace friendly in a debugger because no stack unwinding is involved. They should not be implemented using exceptions at all. An exception is supposed to always be caught; letting an uncaught exception leave main() is not recommended usage.

Why? That's just the way it is. Since C++11, there is some support for dealing with exceptions thrown in threads other than main, but you need to instrument the threads to catch exceptions and re-throw them. Here's how.
#include <cstdio>
#include <exception>
#include <iostream>
#include <stdexcept>
#include <thread>
class XXX {
public:
XXX() { std::fprintf(stderr, "XXX ctor\n"); }
~XXX() { std::fprintf(stderr, "XXX dtor\n"); }
};
void mytest(int i)
{
XXX xtemp;
throw std::runtime_error("Hello, world!");
}
int main(int argc, char *argv[])
{
std::exception_ptr exception_ptr = nullptr;
if (argc == 1) {
mytest(0);
}
else {
std::thread th([&exception_ptr]() {
try {
mytest(0);
}
catch (...) {
exception_ptr = std::current_exception();
}
});
th.join();
if (exception_ptr) {
try {
std::rethrow_exception(exception_ptr);
}
catch (const std::exception &ex)
{
std::cerr << "Thread exited with exception: " << ex.what() << "\n";
}
}
}
}

You should catch the exception in the thread where it occurs. Otherwise the default handler calls terminate() at the point where the exception goes unhandled, and whether the stack is unwound first depends on the implementation.
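If the worker does not need to be a raw std::thread, the standard library can do the catch-and-transport step itself. The sketch below assumes a hypothetical failing_worker in place of mytest() and uses std::async: the exception thrown on the worker thread is stored in the shared state and rethrown by get() on the calling thread.

```cpp
#include <future>
#include <stdexcept>
#include <string>

// Hypothetical worker standing in for mytest(): always fails.
void failing_worker() {
    throw std::runtime_error("Hello, world!");
}

// Runs the worker on another thread and reports what happened.
std::string run_and_report() {
    std::future<void> result = std::async(std::launch::async, failing_worker);
    try {
        result.get();  // rethrows the worker's exception on this thread
        return "no exception";
    } catch (const std::exception& ex) {
        return std::string("Thread exited with exception: ") + ex.what();
    }
}
```

Because the exception is caught inside the worker thread by the library, the worker's stack is unwound normally and terminate() is never reached.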

Is there a way to get the pointer that was accessed during the handling of a signal?

Approximately the code is this:
#include <signal.h>
void SegmentationFaultHandler( int signal ) {
if ( signal == SIGSEGV ) {
// how to check here if it's actual null pointer?
Throw( NullPointerException, "Object pointer not set to an instance of an object." );
}
else
Throw( InvalidOperationException, "Signal has been intercepted by wrong function." );
}
int main( ) {
signal( SIGSEGV, SegmentationFaultHandler );
try {
int *i = null;
*i = 0;
...
How can I tell whether I forgot to check a pointer or accessed uninitialized data and dereferenced it?
I know it's possible, because debuggers can know which address the program tried to access.

It appears that you can retrieve PEXCEPTION_POINTERS inside your handler using the _pxcptinfoptrs global variable, which is declared in signal.h. Then use this pointer as in the examples below.
static void sigsegv_handler(int signo)
{
PEXCEPTION_POINTERS excptr = _pxcptinfoptrs;
if (excptr != NULL) {
}
// ...
}
Vectored Exception Handler
On Windows you can use a Vectored Exception Handler. Your handler will look as follows:
LONG WINAPI ExceptionFilter( struct _EXCEPTION_POINTERS * pExceptionPointers ) {
then:
pExceptionPointers->ExceptionRecord->ExceptionCode
is your exception code; EXCEPTION_ACCESS_VIOLATION means you accessed invalid memory.
pExceptionPointers->ExceptionRecord->ExceptionInformation[0] == 0
is true when a read operation was attempted
pExceptionPointers->ExceptionRecord->ExceptionInformation[0] == 1
is true for a write operation
pExceptionPointers->ExceptionRecord->ExceptionInformation[1]
is the address that was being read or written when the exception happened
Structured Exception filtering
If you cannot use a vectored exception handler, you can add __try/__except at the lowest level of your code, i.e. in main() or wherever your thread function is executed:
__try {
// Code which might cause access violation or any other hardware exceptions
}
__except(ExceptionFilter(GetExceptionInformation())) {
}

System exception handling on different platforms

Basically, how do I catch exceptions on Mac/Linux? That is, exceptions that are not intrinsic to the language, like segfaults and integer division by zero. When compiling with MSVC, __try/__except is perfect because the stack handling allows me to catch the exception and continue execution lower down the stack.
Now I would like to extend my program to other platforms (mainly the ones mentioned), but I have no idea how exception handling works on them. As far as I understand, it is handled through POSIX signals? And as such, it won't allow me to handle the exception and continue lower down the stack?
Edit: Would this be valid (pseudo code)? As I see it, I leave the C++ blocks correctly and thus don't indulge in UB.
#include <csetjmp>
#include <csignal>
#include <stdexcept>
jmp_buf buffer;
template< typename func >
void protected_code(func f) {
if(!setjmp(buffer)) {
f();
}
else
{
throw std::runtime_error("exception happened in f()");
}
}
void sig_handler(int) {
longjmp(buffer, 1);
}
int main() {
signal(SIGFPE, sig_handler);
try {
protected_code( [&]
{
1/0;
}
);
}
catch(const std::exception & e) {
...
}
}
Edit 2:
Wow, for some reason I never thought of just throwing a C++ exception from the signal handler; no need to use longjmp/setjmp then. It of course relies on the signal handler being called on the same thread and stack that faulted. Is this defined/guaranteed somewhere?
Code example:
void sig_handler(int arg) {
throw 4;
}
int main() {
signal(SIGFPE, sig_handler);
try {
int zero = 1;
zero--;
int ret = 1/zero;
} catch(int x) {
printf("catched %d\n", x);
}
return 0;
}

On Unix, you'd catch processor faults with signal handlers, using the sigaction function to install a suitable handler for the signal that you want to handle.
(I think you mean __try ... __except rather than __try ... __catch.)
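As a rough illustration of the sigaction route, the sketch below installs a SIGFPE handler and jumps back to a recovery point with siglongjmp. It is a POSIX-only sketch, not a portable recipe: run_protected, on_fault and the globals are hypothetical names, the recovery point is a single global (so it is not thread safe), and jumping out of a handler for a real hardware fault has platform-specific caveats.

```cpp
#include <setjmp.h>  // POSIX: sigjmp_buf, sigsetjmp, siglongjmp
#include <signal.h>  // POSIX: sigaction, siginfo_t

static sigjmp_buf g_recovery_point;              // hypothetical names throughout
static volatile sig_atomic_t g_fault_signal = 0;

// Handler: record which signal fired and jump back to the recovery point.
// With SA_SIGINFO, info->si_addr carries the address associated with the
// fault (for SIGSEGV, the memory address that was accessed).
static void on_fault(int signo, siginfo_t* /*info*/, void* /*context*/) {
    g_fault_signal = signo;
    siglongjmp(g_recovery_point, 1);
}

// Runs f() and returns true if it completed, false if SIGFPE interrupted it.
// f must not hold objects with non-trivial destructors when the signal
// arrives, because siglongjmp skips C++ stack unwinding.
bool run_protected(void (*f)()) {
    struct sigaction action = {};
    struct sigaction previous = {};
    action.sa_sigaction = on_fault;
    action.sa_flags = SA_SIGINFO;
    sigemptyset(&action.sa_mask);
    sigaction(SIGFPE, &action, &previous);

    volatile bool completed = false;
    if (sigsetjmp(g_recovery_point, 1) == 0) {   // 1: also save the signal mask
        f();
        completed = true;
    }
    sigaction(SIGFPE, &previous, nullptr);       // restore the old disposition
    return completed;
}
```

A caller can turn a false return value into an ordinary C++ exception after run_protected has returned, which keeps the throw outside the signal handler.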

try/catch to avoid .stackdump

In the code below I use try/catch in the Python module code. In the try block I have a simple error (a memory access violation), and I am trying to catch the corresponding exception and terminate the program quietly without generating the .stackdump file. However, the file is still generated, which implies that the try/catch construct does not do its job. How can I avoid generating the .stackdump file and exit the program without errors when an improper operation (like the one in the code) is encountered?
P.S. I'm compiling the code in Cygwin with gcc and Boost.Python.
Interestingly, it fails only in the case x[3]=2, but works for all other cases, e.g. x[4]=2 or x[20]=2 or, obviously, x[2]=2.
#include <boost/python.hpp>
#include <iostream>
#include <iomanip>
using namespace std;
using namespace boost::python;
class Hello
{
std::string _msg;
public:
Hello(std::string msg){_msg = msg;}
void run(){
try{
double* x;
x = new double[3];
x[3] = 2.0;
delete [] x;
}catch(...){ exit(0); }
}
};
BOOST_PYTHON_MODULE(xyz)
{
class_<Hello>("Hello", init<std::string>())
.def("run",&Hello::run)
;
}
EDIT:
Following Maciek's suggestion, I tried the following trick:
Make the signal handling function throw an exception instead of exiting:
void sig_action(int signo) {
std::cout << "SIGNAL " << signo << std::endl;
throw 1;
// exit(0);
}
And now enclose the possibly problematic function in a try/catch block (the signal handlers are installed in the class constructor):
class Hello
{
std::string _msg;
public:
Hello(std::string msg){
_msg = msg;
signal(SIGABRT, sig_action);
signal(SIGSEGV, sig_action);
}
void set(std::string msg) { this->_msg = msg; }
std::string greet() { return _msg; }
void run(){
try{
double* x;
x = new double[3];
x[3] = 2.0;
delete [] x;
}catch(...){ cout<<"error in function run()\n"; exit(0); }
}
};
However, this trick doesn't work as I expected; it produces the following output:
SIGNAL 6
terminate called after throwing an instance of 'int'
SIGNAL 6
terminate called recursively
SIGNAL 6
terminate called recursively
....
(and many more times the same)
So the exception is thrown, but everything finishes before it has been caught. Is there any way to let it be caught before terminating the process?

You can only catch exceptions that are thrown. An invalid pointer access doesn’t throw an exception, it simply causes undefined behaviour, and in your particular case it results in a stack dump.
If you want to catch such a situation, use std::vector and its at function to access items. This will throw std::out_of_range when used with an invalid index. However, it’s usually better to avoid the possibility of such accesses a priori since they are usually indicative of a bug in your program, and bugs should not be handled via exceptions, they should be removed from the code.
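A minimal sketch of the bounds-checked access, assuming a hypothetical helper named try_store that reports failure through its return value:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: stores value at index, reporting failure instead of
// writing out of bounds.
bool try_store(std::vector<double>& x, std::size_t index, double value) {
    try {
        x.at(index) = value;  // bounds-checked; throws std::out_of_range
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

For a vector of three elements, index 2 succeeds and index 3 is rejected, whereas x[3] on a raw array is undefined behaviour that no try/catch can intercept.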

On Linux, core dumps are generated by signals whose default action is set to core (SIGABRT, SIGSEGV, ...). If you want to avoid the core dump, you can always catch or ignore those signals. This should work for Cygwin stackdumps as well, but you will probably still get some nasty message as output.
EDIT:
#include <signal.h>
// [...]
void sig_action(int signo) {
std::cout << "SIGNAL " << signo << std::endl;
exit(0);
}
int main(int argc, char* argv[]) {
signal(SIGABRT, sig_action);
signal(SIGSEGV, sig_action);
Hello h("msg");
h.run();
}

Why does throw "nothing" cause program termination?

const int MIN_NUMBER = 4;
class Temp
{
public:
Temp(int x) : X(x)
{
}
bool getX() const
{
try
{
if( X < MIN_NUMBER)
{
//By mistake throwing any specific exception was missed out
//Program terminated here
throw ;
}
}
catch (bool bTemp)
{
cout<<"catch(bool) exception";
}
catch(...)
{
cout<<"catch... exception";
}
return X;
}
private:
int X;
};
int main(int argc, char* argv[])
{
Temp *pTemp = NULL;
try
{
pTemp = new Temp(3);
int nX = pTemp->getX();
delete pTemp;
}
catch(...)
{
cout<<"cought exception";
}
cout<<"success";
return 0;
}
In the above code, throw false was intended in the getX() method, but due to a human error(!) the false was left out. The innocent-looking code crashed the application.
My question is: why does the program get terminated when we throw "nothing"?
I have a little understanding that throw; is basically a "rethrow" and must be used in an exception handler (catch). Using it in any other place results in program termination, so why does the compiler not raise a flag during compilation?

This is expected behaviour. From the C++ standard:
If no exception is presently being handled, executing a throw-expression with no operand calls terminate() (15.5.1).
As to why the compiler can't diagnose this, it would take some pretty sophisticated flow analysis to do so and I guess the compiler writers would not judge it as cost-effective. C++ (and other languages) are full of possible errors that could in theory be caught by the compiler but in practice are not.

To elaborate on Neil's answer:
throw; by itself will attempt to re-raise the exception currently being handled; if several are being handled, it rethrows the most recent one. If none is being handled, terminate() is called to signal that your program did something bogus.
As to your next question, the compiler doesn't warn about throw; outside a catch block because it can't tell at compile time whether the throw; line may be executing in the context of a catch block. Consider:
// you can try executing this code on http://codepad.org/pZv9VgiX
#include <iostream>
using namespace std;
void f() {
throw 1;
}
void g() {
// will look at int and char exceptions
try {
throw;
} catch (int xyz){
cout << "caught int " << xyz << "\n";
} catch (char xyz){
cout << "caught char " << xyz << "\n";
}
}
void h() {
try {
f();
} catch (...) {
// use g as a common exception filter
g();
}
}
int main(){
try {
h();
} catch (...) {
cout << "some other exception.\n";
}
}
In this program, g() operates as an exception filter, and can be used from h() and any other function that could use this exception handling behavior. You can even imagine more complicated cases:
void attempt_recovery() {
try{
// do stuff
return;
} catch (...) {}
// throw original exception cause
throw;
}
void do_something() {
for(;;) {
try {
// do stuff
} catch (...) {
attempt_recovery();
}
}
}
Here, if an exception occurs in do_something, the recovery code will be invoked. If that recovery code succeeds, the original exception is forgotten and the task is re-attempted. If the recovery code fails, that failure is ignored and the previous failure is re-thrown. This works because the throw; in attempt_recovery is invoked in the context of do_something's catch block.

From the C++ standard:
15.1 Throwing an exception
...
If no exception is presently being handled, executing a throw-expression with no operand calls terminate()
The reason the compiler can't reliably catch this type of error is that exception handlers can call functions/methods, so there's no way for the compiler to know whether the throw is occurring inside a catch. That's essentially a runtime thing.

I have little understanding that throw; is basically "rethrow" and must be used in exception handler (catch). Using this concept in any other place would results into program termination then why does compiler not raise flags during compilation?
Rethrowing is useful. Suppose you have a call stack three levels deep with each level adding some context resource object for the final call. Now, when you have an exception at the leaf level, you will expect some cleanup operation for whatever resources the object has created. But this is not all, the callers above the leaf may also have allocated some resources which will need to be deallocated. How do you do that? You rethrow.
However, what you have is not rethrow. It is a signal of giving up after some failed attempts to catch and process any and all exceptions that were raised.

A throw inside of a catch block with no args will re-throw the same exception that was caught, so it will be caught at a higher level.
A throw outside of a catch block with no args will cause a program termination.
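A function that relies on throw; can protect itself against being called outside a handler by checking std::current_exception() first. The sketch below assumes a hypothetical helper named describe_current_exception:

```cpp
#include <exception>
#include <string>

// Hypothetical helper: describes the exception currently being handled, or
// reports that there is none instead of letting `throw;` call terminate().
std::string describe_current_exception() {
    if (!std::current_exception()) {
        return "no active exception";
    }
    try {
        throw;  // safe here: an exception is currently being handled
    } catch (const std::exception& e) {
        return std::string("std::exception: ") + e.what();
    } catch (...) {
        return "unknown exception";
    }
}
```

Called from inside a catch block, it classifies the exception in flight. Called anywhere else, it returns early instead of terminating the program.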

To complete the previous answers with an example of when/why the compiler cannot detect the problem:
// Centralized exception processing (if it makes sense)
void processException()
{
try {
throw;
}
catch ( std::exception const & e )
{
std::cout << "Caught std::exception: " << e.what() << std::endl;
}
catch ( ... )
{
std::cout << "Caught unknown exception" << std::endl;
}
}
int main()
{
try
{
throw 1;
}
catch (...)
{
processException(); // correct, still in the catch clause
}
processException(); // terminate() no alive exception at the time of throw.
}
When compiling the function processException the compiler cannot know how and when it will be called.

You don't have anything to catch, and so the exception bubbles all the way up. Even catch(...) needs something.
