I'm faced with the problem that my applications global variable destructors are not called. This seems to occur only if my application successfully connects to an oracle database (using OCI).
I put some breakpoints in the CRT and it seems that DllMain (or __DllMainCRTStartup) is not called with DLL_PROCESS_DETACH, thus no atexit() is called which explains why my destructors are not called.
I have no idea why this happens.
I realize that this is probably not enough information to be able to indicate the cause, but my question is: What would be a good start to look for the cause of this problem?
This is a list of things I already tried:
search the net for solutions
attached the debugger and enable native exceptions to see there was no hidden crash, sometimes I get an exception in the .Net framework, but the app seems to continue.
try to reproduce in a small application, no success
The most common situation I encounter where this occurs is a program crash. In certain circumstances crashes can happen silently from an end user perspective. I would attach a debugger to the program, set it to break on all native exceptions and run the scenario.
It's possible that someone is calling TerminateProcess, which unlike ExitProcess does not notify DLLs of shutdown.
This might be helpful:
What happens to global variables declared in a DLL?
Are your globals declared inside a dll, or in your application's memory space?
Calling theexit API often means that the application exits without calling destructors. I an not sure that VC does in this case.
Also try avoid using global objects. This is because you have little control over when and in what order the constructors and destructors are called. Rather convert the objects into pointers and initialise and destroy the pointers using the appropriate DllMain hooks. Since DllMain is an OS construct and not a language construct, it should prove more reliable in the case of normal exits.
Related
I work on a product that's usually built as a shared library.
The using application will load it, create some handles, use them, and eventually free all the handles and unload the library.
The library creates some background threads which are usually stopped at the point the handles are freed.
Now, the issue is that some consuming applications aren't super well-behaved, and will fail to free the handles in some cases (cancellation, errors, etc). Eventually, static destructors in our library run, and crash when they try to interact with the (now dead) background thread(s).
One possibility is to not have any global objects with destructors, and so to avoid running any code in the library during static destruction. This would probably solve the crash on process exit, but it would introduce leaks and crashes in the scenario where the application simply unloads the library without freeing the handles (as opposed to exiting), as we wouldn't ensure that the background threads are actually stopped before the code they were running was unloaded.
More importantly, to my knowledge, when main() exits, all other threads will be killed, wherever they happened to be at the time, which could leave locks locked, and invariants broken (for example, within the heap manager).
Given that, does it even make sense to try and support these buggy applications?
Yes, your library should allow the process to exit without warning. Perhaps in an ideal world every program using your library would carefully track the handles and free them all when it exits for any reason, but in practice this isn't a realistic requirement. The code path that is triggering the program exit might be a shared component that isn't even aware that your library is in use!
In any case, it is likely that your current architecture has a more general problem, because it is inherently unsafe for static destructors to interact with other threads.
From DllMain entry point in MSDN:
Because DLL notifications are serialized, entry-point functions should not attempt to communicate with other threads or processes. Deadlocks may occur as a result.
and
If your DLL is linked with the C run-time library (CRT), the entry point provided by the CRT calls the constructors and destructors for global and static C++ objects. Therefore, these restrictions for DllMain also apply to constructors and destructors and any code that is called from them.
In particular, if your destructors attempt to wait for your threads to exit, that is almost certain to deadlock in the case where the library is explicitly unloaded while the threads are still running. If the destructors don't wait, the process will crash when the code the threads are running disappears. I'm not sure why you aren't seeing that problem already; perhaps you are terminating the threads? (That's not safe either, although for different reasons.)
There are a number of ways to resolve this problem. Probably the simplest is the one you already mentioned:
One possibility is to not have any global objects with destructors, and so to avoid running any code in the library during static destruction.
You go on to say:
[...] but it would introduce leaks and crashes in the scenario where the application simply unloads the library without freeing the handles [...]
That's not your problem! The library will only be unloaded if the application explicitly chooses to do so; obviously, and unlike the earlier scenario, the code in question knows your library is present, so it is perfectly reasonable for you to require that it close all your handles before doing so.
Ideally, however, you would provide an uninitialization function that closes all the handles automatically, rather than requiring the application to close each handle individually. Explicit initialization and uninitialization functions also allows you to safely set up and free global resources, which is usually more efficient than doing all of your setup and teardown on a per-handle basis and is certainly safer than using global objects.
(See the link above for a full description of all the restrictions applicable to static constructors and destructors; they are quite extensive. Constructing all your globals in an explicit initialization routine, and destroying them in an explicit uninitialization routine, avoids the whole messy business.)
In cppreference abort, we have
Destructors of variables with automatic, thread local and static storage durations are not called. Functions passed to std::atexit() are also not called. Whether open resources such as files are closed is implementation defined.
I'm bit confused about the terminology and contradiction of the abort term that "closes" my program and from the description of that function which it says that destructors and open resources possibly are not called/closed, respectively. So, is it possible that my program remains running and it has some memory leak or resources still opened after a call to abort()?
It's like killing a person. They won't have a chance to pay any outstanding bills, organize their heritage, clean their apartment, etc.
Whether any of this happens is up to their relatives or other third parties.
So, usually things like open files will be closed and no memory will be leaked because the OS takes care of this (like when the police or so will empty the apartment). There are some platforms where this won't happen, such as 16 bit windows or embedded systems, but under modern windows or Linux systems it will be okay.
However, what definitely won't happen is that destructors are run. This would be like having the to-be-killed person write a last entry into their diary and seal it or something - only the person itself knows how to do it, and they can't when you kill them without warning. So if anything important was supposed to happen in a destructor, it can be problematic, but usually not dramatically - it might be something like that the program created a Temporary file somewhere and would normally delete it on exiting, and now it can't and the file stays.
Still, your program will be closed and not running anymore. It just won't get a chance to clean up things and is therefore depending on the OS to do the right thing and clean up the resources used by it.
Functions passed to std::atexit() are also not called. Whether open resources such as files are closed is implementation defined.
This means the implementation gets to decide what happens. On any common consumer operating system, most objects associated with your process get destroyed when your process exits. That means you won't leak memory that was allocated with new, for example, or open files.
There might be uncommon kinds of objects that aren't freed - for example, if you have a shared memory block, it might stay around in case another process tries to access it. Or if you created a temporary file, intending to delete it later, now the file will stay there because your program isn't around to delete it.
On Unix, calling abort() effectively delivers a SIGABRT signal to your process. The default behavior of the kernel when that signal is delivered is to close down your process, possibly leaving behind a core file, and closing any descriptors. Your process's thread of control is completely removed. Note that this all happens outside any notion of c++ (or any other language). That is why it is considered implementation defined.
If you wish to change the default behavior, you would need to install a signal handler to catch the SIGABRT.
I'm working on a C++ application which uses a library written in C by another team. The writers of the library like to call exit() when errors happen, which ends the program immediately without calling the destructors of objects on the stack in the C++ application. The application sets up some system resources which don't automatically get reclaimed by the operating system after the process ends (shared memory regions, interprocess mutexes, etc), so this is a problem.
I have complete source code for both the app and the library, but the library is very well-established and has no unit tests, so changing it would be a big deal. Is there a way to "hook" the calls to exit() so I can implement graceful shutdown for my app?
One possibility I'm considering is making one big class which is the application - meaning all cleanup would happen either in its destructor or in the destructor of one of its members - then allocating one of these big objects on the heap in main(), setting a global pointer to point to it, and using atexit() to register a handler which simply deletes the object via the global pointer. Is that likely to work?
Is there a known good way to approach this problem?
In the very worst case, you can always write your own implementation of exit and link it rather than the system's own implementation. You can handle the errors there, and optionally call _exit(2) yourself.
Since you have the library source, it's even easier - just add a -Dexit=myExit flag when building it, and then provide an implementation of myExit.
install exit handler with atexit and implement the desired behavior
If you want to make the C library more usable from C++, you could perhaps run it in a separate process. Then make sure (with an exit handler or otherwise) that when it exits, your main application process notices and throws an exception to unwind its own stack. Perhaps in some cases it could handle the error in a non-fatal way.
Of course, moving the library use into another process might not be easy or particularly efficient. You'll have some work to do to wrap the interface, and to copy inputs and outputs via the IPC mechanism of your choice.
As a workaround to use the library from your main process, though, I think the one you describe should work. The risk is that you can't identify and isolate everything that needs cleaning up, or that someone in future modifies your application (or another component you use) on the assumption that the stack will get unwound normally.
You could modify the library source to call a runtime- or compile-time-configurable function instead of calling exit(). Then compile the library with exception-handling and implement the function in C++ to throw an exception. The trouble with that is that the library itself probably leaks resources on error, so you'd have to use that exception only to unwind the stack (and maybe do some error reporting). Don't catch it and continue even if the error could be non-fatal as far as your app is concerned.
If the call exit and not assert or abort, there are a few points to get control again:
When calling exit, the destructors for objects with static lifetime (essentially: globals and objects declared with static) are still executed. This means you could set up a (few) global "resource manager" object(s) and release the resources in their destructor(s).
As you already found, you can register hooks with atexit. This is not limited to one. You can register more.
If all else fails, because you have the source of the library, you can play some macro tricks to effectively replace the calls to exit with a function of your own that could, for example, throw an exception.
I am refactoring an old code, and one of the things I'd like to address is the way that errors are handled. I'm well aware of exceptions and how they work, but I'm not entirely sure they're the best solution for the situations I'm trying to handle.
In this code, if things don't validate, there's really no reason or advantage to unwind the stack. We're done. There's no point in trying to save the ship, because it's a non-interactive code that runs in parallel through the Sun Grid Engine. The user can't intervene. What's more, these validation failures don't really represent exceptional circumstances. They're expected.
So how do I best deal with this? One thing I'm not sure I want is an exit point in every class method that can fail. That seems unmaintainable. Am I wrong? Is it acceptable practice to just call exit() or abort() at the failure point in codes like this? Or should I throw an exception all the way back to some generic catch statement in main? What's the advantage?
Throwing an exception to be caught in main and then exiting means your RAII resource objects get cleaned up. On most systems this isn't needed for a lot of resource types. The OS will clean up memory, file handles, etc. (though I've used a system where failing to free memory meant it remained allocated until system restart, so leaking on program exit wasn't a good idea.)
But there are other resource types that you may want to release cleanly such as network or database connections, or a mechanical device you're driving and need to shut down safely. If an application uses a lot of such things then you may prefer to throw an exception to unwind the stack back to main, and then exit.
So the appropriate method of exiting depends on the application. If an application knows it's safe then calling _Exit(), abort(), exit(), or quickexit() may be perfectly reasonable. (Library code shouldn't call these, since obviously the library has no idea whether its safe for every application that will ever use the library.) If there is some critical clean up that must be performed before an application exits but you know it's limited, then the application can register that clean up code via atexit() or at_quick_exit().
So basically decide what you need cleaned up, document it, implement it, and try to make sure it's tested.
It is acceptable to terminate the program if it cannot handle the error gracefully. There are few things you can do:
Call abort() if you need a core dump.
Call exit() if you want to give a chance to run to those routines registered with atexit() (that is most likely to call destructors for global C++ objects).
Call _exit() to terminate a process immediately.
There is nothing wrong with using those functions as long as you understand what you are doing, know your other choices, and choose that path willingly. After all, that's why those functions exist. So if you don't think it makes any sense to try to handle the error or do anything else when it happens - go ahead. What I would probably do is try to log some informative message (say, to syslog), and call _exit. If logging fails - call abort to get a core along the termination.
I'd suggest to call global function
void stopProgram() {
exit(1);
}
Later you can change it's behavior, so it is maintainable.
As you pointed out, having an exit or abort thrown around throughout your code is not maintainable ... additionally, there may be a mechanism in the future that could allow you to recover from an error, or handle an error in a more graceful manner than simply exiting, and if you've already hard-coded this functionality in, then it would be very hard to undo.
Throwing an exception that is caught in main() is your best-bet at this point that will also give you flexibility in the future should you run the code under a different scenario that will allow you to recover from errors, or handle them differently. Additionally, throwing exceptions could help should you decide to add more debugging support, etc., as it will give you spots to implement logging features and record the program state from isolated and maintainable points in the software before you decide let the program exit.
When I exit my C++ program it crashes with errors like:
EAccessViolation with mesage 'Access violation at address 0...
and
Abnormal Program Termination
It is probably caused by some destructor because it happens only when the application exits. I use a few external libraries and cannot find the code that causes it. Is there a function that forces immediate program exit (something like kill in Linux) so that memory would have to be freed by the operating system? I could use this function in app exit event.
I know that it would be a terrible solution because it'd just hide the problem.
I'm just asking out of sheer curiosity, so please don't give me -1 :)
I tried exit(0) from stdlib but it didn't help.
EDIT:
Thanks for your numerous replies:)
I use Builder C++ 6 (I know it's outdated but for some reasons I had to use it). My app uses library to neural networks (FANN). Using the debugger I found that program crashes in:
~neural_net()
{
destroy();
}
destroy() calls multiple time another function fann_safe_free(ptr), that is:
#define fann_safe_free(x) {if(x) { free(x); x = NULL; }}
The library works great, problem only appears when it does cleaning. That's why I asked about so brutal solution. My app is multi-threaded but other threads operate on different data.
I will analyze my code for the n-th time(the bug must be somewhere), thanks for all your tips :)
You should fix the problem.
First step: find at check all functions you register with atexit() (not many I hope)
Second step: find all global variables and check their destructors.
Third Step: find all static function variables check their destructors.
But otherwise you can abort.
Note: abort is for Abnormal program termination.
abort()
The difference: (note letting an application leave the main function is the equivalent of exit())
exit()
Call the functions registered with the atexit(3) function, in the reverse order of their registration. This includes the destruction of all global (static storage duration) variables.
Flush all open output streams.
Close all open streams.
Unlink all files created with the tmpfile(3) function.
abort()
Flush all open output streams.
Close all open streams.
It's a terrible solution for more than one reason. It will hide the problem (maybe), but it could also corrupt data, depending on the nature of your application.
Why don't you use a debugger and try to find out what is causing the error?
If your application is multi-threaded, you should make sure that all threads are properly shut down before exiting the application. This is a fairly common cause of that type of error on exit, when a background thread is attempting to use memory/objects that have already been destructed.
Edit:
based on your updated question, I have the following suggestions:
Try to find out more specifically what is causing the crash in the destructor.
The first thing I would do is make sure that it's not trying to destruct a NULL object. When you get your crash in ~neural_net in your debugger, check your "this" pointer to make sure it's not NULL. If it is, then check your call-stack and see where it's being destructed, and do a check to make sure it's not NULL before calling delete.
If it's not NULL, then I would unroll that macro in destroy, so you can see if it's crashing on the call to free.
You could try calling abort(); (declared in <stdlib.h> and in <process.h>)
The version in VisualC++, however, will print a warning message as it exits: "This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information."
On Linux/UNIX you can use _exit:
#include <unistd.h>
void _exit(int status);
The function _exit() is like exit(), but does not call any functions registered with atexit() or on_exit(). Whether it flushes standard I/O buffers and removes temporary files created with tmpfile(3) is implementation dependent. On the other hand, _exit() does close open file descriptors, and this may cause an unknown delay, waiting for pending output to finish. If the delay is undesired, it may be useful to call functions like tcflush() before calling _exit(). Whether any pending I/O is cancelled, and which pending I/O may be cancelled upon _exit(), is implementation-dependent.
Have you tried the gruesome step by step? If you're project/solution is simply to large to do so maybe you could try segmenting it assuming you use a modular build and test each component indivdually. Without any code or visible destructors abstract advice is all I can give you I'm afraid. But nonetheless I hope trying to minimize the debugging field will help in some way.
Good luck with getting an answer :)
That immediate program exit (and yes, that's a terrible solution) is abort()
That happens most likely because a NULL pointer is being accessed. Depending on your OS try getting a stack trace and identify the culprit, don't just exit.
If you use linux, valgrind should solve your problem.
but if it is windows, try one of these: MemoryValidator, BoundsChecker or other tools like these.
Simply close your application is not the best way to deal with bugs ...