It's more a philosophical type of question.
In C++ we have nice shiny idiom - RAII. But often I see it as incomplete. It does not well aligns with the fact that my application can be killed with SIGSEGV.
I know, I know, programms like that are malformed you say. But there is sad fact that on POSIX (specifically Linux) you can allocate beyond physical memory limits and meet SIGSEGV in the middle of the execution, working with correctly allocated memory.
You may say: "Application dies, why should you care about those poor destructors not being called?". Unfortunately there are some resources that are not automatically freed when application terminates, such as File System entities.
I am pretty sick right now of designing hacks, breaking good application design just to cope with this. So, what I am asking is for a nice, elegant solution to this kind of problems.
Edit:
It seems that I was wrong, and on Linux applications are killed by a kernel pager. In which case the question is still the same, but the cause of application death is different.
Code snippet:
struct UnlinkGuard
{
UnlinkGuard(const std::string path_to_file)
: _path_to_file(path_to_file)
{ }
~UnlinkGuard() {
unlink();
}
bool unlink() {
if (_path_to_file.empty())
return true;
if (::unlink(_path_to_file.c_str())) {
/// Probably some logging.
return false;
}
disengage();
return true;
}
void disengage() {
_path_to_file.clear();
}
private:
std::string _path_to_file;
};
void foo()
{
/// Pick path to temp file.
std::string path_to_temp_file = "...";
/// Create file.
/// ...
/// Set up unlink guard.
UnlinkGuard unlink_guard(path_to_temp_file);
/// Call some potentially unsafe library function that can cause process to be killed either:
/// * by a SIGSEGV
/// * by out of memory
/// ...
/// Work done, file content is appropriate.
/// Rename tmp file.
/// ...
/// Disengage unlink guard.
unlink_guard.disengage();
}
On success I use file. On failure I want this file to be missing.
This could be achived if POSIX had support for link()-ing of previously unlinked file by file descriptor, but there is no such feature :(.
So, what I am asking is for a nice, elegant solution to this kind of problems.
None exists, neither for C++ nor for other languages. You are faced with a fundamental physical reality here, not a design decision: what happens when the user pulls the plug? No programming solution can guard against that (well, there’s restore-upon-restart).
What you can do is catch POSIX signals and sometimes you can even handle them – but it’s flakey and there are tons of caveats, which another discussion on Stack Overflow details.
Most resources should not be cleared up after a segfault. If you want to do it anyway, simply collect those resources (or rather, handlers for their cleanup) in a global array, trap SIGSEGV, iterate through the cleanup routine array in the handler (hoping that the relevant memory is still intact), and perform the cleanup.
More specifically, for temporary files it helps to create them inside one of the system’s temporary folders. It’s understood that these don’t always get cleaned up by their respective applications, and either the system or the user will periodically perform cleanup instead.
Usually the solution, regardless of language or OS, is to clean up when you start the program, not (only) when you terminate. If your program can create temporary files that it cleans up on shutdown, clean up the temporary files when you start the program too.
Most everything else, like file handles, tcp connections, and so forth, is killed by the OS when your application dies.
Related
Ok, the question of the century :)
Before you say or think anything, let me tell you that I've read couple of similar questions about this very topic, but I didn't find clear solution for my problem. My case is specific, and typical for system programmers I think.
I have this situation very often. I hate gotos, don't really know why, probably because everybody is yelling that it's bad. But until now I did not find better solution for my specific scenario, and the way I do it currently might be uglier than the use of goto.
Here is my case: I use C++ (Visual C++) for Windows application development, and quite often I use bunch of APIs in my routines. Suppose following situation:
int MyMemberFunction()
{
// Some code... //
if (!SomeApi())
{
// Cleanup code... //
return -1;
}
// Some code... //
if (!SomeOtherApi())
{
// Cleanup code... //
return -2;
}
// Some more code... //
if (!AnotherApi())
{
// Cleanup code... //
return -3;
}
// More code here... //
return 0; // Success
}
So after each Api I have to check if it succeeded, and abort my function if it did not. For this, I use whole bunch of // Cleanup code... //, often pretty much duplicated, followed by the return statement. The function performs, say, 10 tasks (e.g. uses 10 Apis), and if task #6 fails, I have to clean up the resources created by previous tasks. Note that the cleanup should be done by the function itself, so exception handling cannot be used. Also, I can't see how much-talked RAII can help me in this situation.
The only way I've thought of, is to use goto to make jump from all such failure cases to one cleanup label, placed at the end of the function.
Is there any better way of doing this? Will using goto considered bad practice in such situation? What to do then? Such situation is very typical for me (and for system programmers like me, I believe).
P.S.: Resources that need to be cleanup up, are of different types. There might be memory allocations, various system object handles that need closure, etc.
UPDATE:
I think people still did not get what I wanted (probably I'm explaining badly). I thought the pseudo-code should be enough, but here is the practical example:
I open two files with CreateFile. If this step fails: I have to cleanup the already-open files handle(s), if any. I will later read part of one file and write into another.
I use SetFilePointer to position read pointer in first file. If this step fails: I have to close handles opened by previous step.
I use GetFileSize to get target file size. If api fails, or file size is abnormal, I have to make cleanup: same as in previous step.
I allocate buffer of specified size to read from first file. If memory allocation fails, I have to close file handles again.
I have to use ReadFile to read from first file. If this fails, I have to: release buffer memory, and close file handles.
I use SetFilePointer to position write pointer in second file. If this fails, same cleanup has to be done.
I have to use WriteFile to write to second file. If this fails, bla-bla-bla...
Additionally, suppose I guard this function with critical section, and after I call EnterCriticalSection in the beginning of the function, I have to call LeaveCriticalSection before every return statement.
Now note that this is very simplified example. There might be more resources, and more cleanup to be done - mostly same, but sometimes a little bit different, based on which step did fail. But let's talk within this example: how can I use RAII here?
Use of goto is not needed, it is error prone, resulting in redundant and rather unsafe code.
Use RAII and you do not have to use goto. RAII through smart pointers is ideal for your scenario.
You ensure all your resources in the scope are RAII managed(either use smart pointers or your own resource managing classes for them), whenever a error condition occurs all you have to do is return and RAII will magically free your resources implicitly.
RAII will work for this as long as the resources created by previous tasks are maintained in the form of classes/objects that clean up after themselves. You mentioned memory and system object handles, so let's use those as a starting point.
// non RAII code:
int MyMemberFunction() {
FILE *a = fopen("Something", "r");
if (!task1()) {
fclose(a);
return -1;
}
char *x = new char[128];
if (!task2()) {
delete [] x;
fclose(a);
return -2;
}
}
RAII-based code:
int MyMemberFunction() {
std::ifstream a("Something");
if (!task1())
return -1; // a closed automatically when it goes out of scope
std::vector<char> x(128);
if (!task2())
return -2; // a closed, x released when they go out of scope
return 0; // again, a closed, x released when they go out of scope
}
Also note that if you normally expect things to work, you can write the code to portray that a bit more closely:
int MyMemberFunction() {
bool one, two;
if ((one=task1()) && (two=task2()))
return 0;
// If there was an error, figure out/return the correct error code.
if (!one)
return -1;
return -2;
}
Edit: Although it's fairly unusual, if you really need to use c-style I/O, you can still wrap it up into a C++ class vaguely similar to an iostream:
class stream {
FILE *file;
public:
stream(char const &filename) : file(fopen(filename, "r")) {}
~stream() { fclose(file); }
};
That's obviously simplified (a lot), but the general idea works perfectly fine. There's a less obvious, but generally superior approach: an iostream actually uses a buffer class, using underflow for reading and overflow for writing (again, simplifying, though not as much this time). It's not terribly difficult to write a buffer class that uses a FILE * to handle the reading/writing. Most of the code involved is really little more than a fairly thin translation layer to provide the right names for the functions, rearrange the parameters to the right types and orders, and so on.
In the case of memory, you have two different approaches. One is like this, writing a vector-like class of your own that acts purely as a wrapper around whatever memory management you need to use (new/delete, malloc/free, etc.)
Another approach is to observe that std::vector has an Allocator parameter, so it's really already just a wrapper, and you can designate how it obtains/frees memory. If, for example, you really needed its back-end to be malloc and free, it would be fairly easy to write an Allocator class that used them. This way most of your code would follow normal C++ conventions, using std::vector just like anything else (and still supporting RAII as above). At the same time, you get complete control over the implementation of the memory management, so you can use new/delete, malloc/free, or something directly from the OS if needed (e.g., LocalAlloc/LocalFree on Windows).
Use boost:
scoped_ptr for handles, etc
scoped_lock to control mutex.
Consider this simple class that demonstrates RAII in C++ (From the top of my head):
class X {
public:
X() {
fp = fopen("whatever", "r");
if (fp == NULL)
throw some_exception();
}
~X() {
if (fclose(fp) != 0){
// An error. Now what?
}
}
private:
FILE *fp;
X(X const&) = delete;
X(X&&) = delete;
X& operator=(X const&) = delete;
X& operator=(X&&) = delete;
}
I can't throw an exception in the destructor. I m having an error, but no way to report it. And this example is quite generic: I can do this not only with files, but also with e.g posix threads, graphical resources, ... I note how e.g. the wikipedia RAII page sweeps the whole issue under the rug: http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization
It seems to me that RAII is only usefull if the destruction is guaranteed to happen without error. The only resources known to me with this property is memory. Now it seems to me that e.g. Boehm pretty convincingly debunks the idea of manual memory management is a good idea in any common situation, so where is the advantage in the C++ way of using RAII, ever?
Yes, I know GC is a bit heretic in the C++ world ;-)
RAII, unlike GC, is deterministic. You will know exactly when a resource will be released, as opposed to "sometime in the future it's going to be released", depending on when the GC decides it needs to run again.
Now on to the actual problem you seem to have. This discussion came up in the Lounge<C++> chat room a while ago about what you should do if the destructor of a RAII object might fail.
The conclusion was that the best way would be provide a specific close(), destroy(), or similar member function that gets called by the destructor but can also be called before that, if you want to circumvent the "exception during stack unwinding" problem. It would then set a flag that would stop it from being called in the destructor. std::(i|o)fstream for example does exactly that - it closes the file in its destructor, but also provides a close() method.
This is a straw man argument, because you're not talking about garbage collection (memory deallocation), you're talking about general resource management.
If you misused a garbage collector to close files this way, then you'd have the identical situation: you also could not throw an exception. The same options would be open to you: ignoring the error, or, much better, logging it.
The exact same problem occurs in garbage collection.
However, it's worth noting that if there is no bug in your code nor in the library code which powers your code, deletion of a resource shall never fail. delete never fails unless you corrupted your heap. This is the same story for every resource. Failure to destroy a resource is an application-terminating crash, not a pleasant "handle me" exception.
Exceptions in destructors are never useful for one simple reason: Destructors destruct objects that the running code doesn't need anymore. Any error that happens during their deallocation can be safely handled in a context-agnostic way, like logging, displaying to the user, ignoring or calling std::terminate. The surrounding code doesn't care because it doesn't need the object anymore. Therefore, you don't need to propagate an exception through the stack and abort the current computation.
In your example, fp could be safely pushed into a global queue of non-closeable files and handled later. The calling code can continue without problems.
By this argument, destructors very rarely have to throw. In practice, they really do rarely throw, explaining the widespread use of RAII.
First: you can't really do anything useful with the error if your file object is GCed, and fails to close the FILE*. So the two are equivalent as far as that goes.
Second, the "correct" pattern is as follows:
class X{
FILE *fp;
public:
X(){
fp=fopen("whatever","r");
if(fp==NULL) throw some_exception();
}
~X(){
try {
close();
} catch (const FileError &) {
// perhaps log, or do nothing
}
}
void close() {
if (fp != 0) {
if(fclose(fp)!=0){
// may need to handle EAGAIN and EINTR, otherwise
throw FileError();
}
fp = 0;
}
}
};
Usage:
X x;
// do stuff involving x that might throw
x.close(); // also might throw, but if not then the file is successfully closed
If "do stuff" throws, then it pretty much doesn't matter whether the file handle is closed successfully or not. The operation has failed, so the file is normally useless anyway. Someone higher up the call chain might know what to do about that, depending how the file is used - perhaps it should be deleted, perhaps left alone in its partially-written state. Whatever they do, they must be aware that in addition to the error described by the exception they see, it's possible that the file buffer wasn't flushed.
RAII is used here for managing resources. The file gets closed no matter what. But RAII is not used for detecting whether an operation has succeeded - if you want to do that then you call x.close(). GC is also not used for detecting whether an operation has succeeded, so the two are equal on that count.
You get a similar situation whenever you use RAII in a context where you're defining some kind of transaction -- RAII can roll back an open transaction on an exception, but assuming all goes OK, the programmer must explicitly commit the transaction.
The answer to your question -- the advantage of RAII, and the reason you end up flushing or closing file objects in finally clauses in Java, is that sometimes you want the resource to be cleaned up (as far as it can be) immediately on exit from the scope, so that the next bit of code knows that it has already happened. Mark-sweep GC doesn't guarantee that.
I want to chip in a few more thoughts relating to "RAII" vs. GC. The aspects of using some sort of a close, destroy, finish, whatever function are already explained as is the aspect of deterministic resource release. There are, at least, two more important facilities which are enabled by using destructors and, thus, keeping track of resources in a programmer controlled fashion:
In the RAII world it is possible to have a stale pointer, i.e. a pointer which points to an already destroyed object. What sounds like a Bad Thing actually enables related objects to be located in close proximity in memory. Even if they don't fit onto the same cache-line they would, at least, fit into the memory page. To some extend closer proximity could be achieved by a compacting garbage collector, as well, but in the C++ world this comes naturally and is determined already at compile-time.
Although typically memory is just allocated and released using operators new and delete it is possible to allocate memory e.g. from a pool and arrange for an even compacter memory use of objects known to be related. This can also be used to place objects into dedicated memory areas, e.g. shared memory or other address ranges for special hardware.
Although these uses don't necessarily use RAII techniques directly, they are enabled by the more explicit control over memory. That said, there are also memory uses where garbage collection has a clear advantage e.g. when passing objects between multiple threads. In an ideal world both techniques would be available and C++ is taking some steps to support garbage collection (sometimes referred to as "litter collection" to emphasize that it is trying to give an infinite memory view of the system, i.e. collected objects aren't destroyed but their memory location is reused). The discussions so far don't follow the route taken by C++/CLI of using two different kinds of references and pointers.
Q. When has RAII an advantage over GC?
A. In all the cases where destruction errors are not interesting (i.e. you don't have an effective way to handle those anyway).
Note that even with garbage collection, you'd have to run the 'dispose' (close,release whatever) action manually, so you can just improve the RIIA pattern in the very same way:
class X{
FILE *fp;
X(){
fp=fopen("whatever","r");
if(fp==NULL) throw some_exception();
}
void close()
{
if (!fp)
return;
if(fclose(fp)!=0){
throw some_exception();
}
fp = 0;
}
~X(){
if (fp)
{
if(fclose(fp)!=0){
//An error. You're screwed, just throw or std::terminate
}
}
}
}
Destructors are assumed to be always success. Why not just make sure that fclose won't fail?
You can always do fflush or some other things manually and check error to make sure that fclose will succeed later.
I have a situation where 2 different processes(mine C++, other done by other people in JAVA) are a writer and a reader from some shared data file. So I was trying to avoid race condition by writing a class like this(EDIT:this code is broken, it was just an example)
class ReadStatus
{
bool canRead;
public:
ReadStatus()
{
if (filesystem::exists(noReadFileName))
{
canRead = false;
return;
}
ofstream noWriteFile;
noWriteFile.open (noWriteFileName.c_str());
if ( ! noWriteFile.is_open())
{
canRead = false;
return;
}
boost::this_thread::sleep(boost::posix_time::seconds(1));
if (filesystem::exists(noReadFileName))
{
filesystem::remove(noWriteFileName);
canRead= false;
return;
}
canRead= true;
}
~ReadStatus()
{
if (filesystem::exists(noWriteFileName))
filesystem::remove(noWriteFileName);
}
inline bool OKToRead()
{
return canRead;
}
};
usage:
ReadStatus readStatus; //RAII FTW
if ( ! readStatus.OKToRead())
return;
This is for one program ofc, other will have analogous class.
Idea is:
1. check if other program created his "I'm owner file", if it has break else go to 2.
2. create my "I'm the owner" file, check again if other program created his own, if it has delete my file and break else go to 3.
3. do my reading, then delete mine "I'm the owner file".
Please note that rare occurences when they both dont read or write are OK, but the problem is that I still see a small chance of race conditions because theoretically other program can check for the existence of my lock file, see that there isnt one, then I create mine, other program creates his own, but before FS creates his file I check again, and it isnt there, then disaster occurs. This is why I added the one sec delay, but as a CS nerd I find it unnerving to have code like that running.
Ofc I don't expect anybody here to write me a solution, but I would be happy if someone does know a link to a reliable code that I can use.
P.S. It has to be files, cuz I'm not writing entire project and that is how it is arranged to be done.
P.P.S.: access to data file isn't reader,writer,reader,writer.... it can be reader,reader,writer,writer,writer,reader,writer....
P.P.S: other process is not written in C++ :(, so boost is out of the question.
On Unices the traditional way of doing pure filesystem based locking is to use dedicated lockfiles with mkdir() and rmdir(), which can be created and removed atomically via single system calls. You avoid races by never explicitly testing for the existence of the lock --- instead you always try to take the lock. So:
lock:
while mkdir(lockfile) fails
sleep
unlock:
rmdir(lockfile)
I believe this even works over NFS (which usually sucks for this sort of thing).
However, you probably also want to look into proper file locking, which is loads better; I use F_SETLK/F_UNLCK fcntl locks for this on Linux (note that these are different from flock locks, despite the name of the structure). This allows you to properly block until the lock is released. These locks also get automatically released if the app dies, which is usually a good thing. Plus, these will let you lock your shared file directly without having to have a separate lockfile. This, too, work on NFS.
Windows has very similar file locking functions, and it also has easy to use global named semaphores that are very convenient for synchronisation between processes.
As far as I've seen it, you can't reliably use files as locks for multiple processes. The problem is, while you create the file in one thread, you might get an interrupt and the OS switches to another process because I/O is taking so long. The same holds true for deletion of the lock file.
If you can, take a look at Boost.Interprocess, under the synchronization mechanisms part.
While I'm generally against making API calls which can throw from a constructor/destructor (see docs on boost::filesystem::remove) or making throwing calls without a catch block in general that's not really what you were asking about.
You could check out the Overlapped IO library if this is for windows. Otherwise have you considered using shared memory between the processes instead?
Edit: Just saw the other process was Java. You may still be able to create a named mutex that can be shared between processes and used that to create locks around the file IO bits so they have to take turns writing. Sorry I don't know Java so no I idea if that's more feasible than shared memory.
I'm working on a library where I'm farming various tasks out to some third-party libraries that do some relatively sketchy or dangerous platform-specific work. (In specific, I'm writing a mathematical function parser that calls JIT-compilers, like LLVM or libjit, to build machine code.) In practice, these third-party libraries have a tendency to be crashy (part of this is my fault, of course, but I still want some insurance).
I'd like, then, to be able to very gracefully deal with a job dying horribly -- SIGSEGV, SIGILL, etc. -- without bringing down the rest of my code (or the code of the users calling my library functions). To be clear, I don't care if that particular job can continue (I'm not going to try to repair a crash condition), nor do I really care about the state of the objects after such a crash (I'll discard them immediately if there's a crash). I just want to be able to detect that a crash has occurred, stop the crash from taking out the entire process, stop calling whatever's crashing, and resume execution.
(For a little more context, the code at the moment is a for loop, testing each of the available JIT-compilers. Some of these compilers might crash. If they do, I just want to execute continue; and get on with testing another compiler.)
Currently, I've got a signal()-based implementation that fails pretty horribly; of course, it's undefined behavior to longjmp() out of a signal handler, and signal handlers are pretty much expected to end with exit() or terminate(). Just throwing the code in another thread doesn't help by itself, at least the way I've tested it so far. I also can't hack out a way to make this work using C++ exceptions.
So, what's the best way to insulate a particular set of instructions / thread / job from crashes?
Spawn a new process.
What output do you collect when a job succeeds?
I ask because if the output is low bandwidth I would be tempted to run each job in its own process.
Each of these crashy jobs you fire up has a high chance of corrupting memory used elsewhere in your process.
Processes offer the best protection.
Processes offer the best protection, but it's possible you can't do that.
If your threads' entry points are functions you wrote, (for example, ThreadProc in the Windows world), then you can wrap them in try{...}catch(...) blocks. If you want to communicate that an exception has occurred, then you can communicate specific error codes back to the main thread or use some other mechanism. If you want to log not only that an exception has occured but what that exception was, then you'll need to catch specific exception types and extract diagnostic information from them to communicate back to the main thread. A'la:
int my_tempermental_thread()
{
try
{
// ... magic happens ...
return 0;
}
catch( const std::exception& ex )
{
// ... or maybe it doesn't ...
string reason = ex.what();
tell_main_thread_what_went_wong(reason);
return 1;
}
catch( ... )
{
// ... definitely not magical happenings here ...
tell_main_thread_what_went_wrong("uh, something bad and undefined");
return 2;
}
}
Be aware that if you go this way you run the risk of hosing the host process when the exceptions do occur. You say you're not trying to correct the problem, but how do you know the malignant thread didn't eat your stack for example? Catch-and-ignore is a great way to create horribly confounding bugs.
On Windows, you might be able to use VirtualProtect(YourMemory, PAGE_READONLY) when calling the untrusted code. Any attempt to modify this memory would cause a Structured Exception. You can safely catch this and continue execution. However, memory allocated by that library will of course leak, as will other resources. The Linux equivalent is mprotect(YorMemory, PROT_READ), which causes a SEGV.
Lets say that I have a library which runs 24x7 on certain machines. Even if the code is rock solid, a hardware fault can sooner or later trigger an exception. I would like to have some sort of failsafe in position for events like this. One approach would be to write wrapper functions that encapsulate each api a:
returnCode=DEFAULT;
try
{
returnCode=libraryAPI1();
}
catch(...)
{
returnCode=BAD;
}
return returnCode;
The caller of the library then restarts the whole thread, reinitializes the module if the returnCode is bad.
Things CAN go horribly wrong. E.g.
if the try block(or libraryAPI1()) had:
func1();
char *x=malloc(1000);
func2();
if func2() throws an exception, x will never be freed. On a similar vein, file corruption is a possible outcome.
Could you please tell me what other things can possibly go wrong in this scenario?
This code:
func1();
char *x=malloc(1000);
func2();
Is not C++ code. This is what people refer to as C with classes. It is a style of program that looks like C++ but does not match up to how C++ is used in real life. The reason is; good exception safe C++ code practically never requires the use of pointer (directly) in code as pointers are always contained inside a class specifically designed to manage their lifespan in an exception safe manor (Usually smart pointers or containers).
The C++ equivalent of that code is:
func1();
std::vector<char> x(1000);
func2();
A hardware failure may not lead to a C++ exception. On some systems, hardware exceptions are a completely different mechanism than C++ exceptions. On others, C++ exceptions are built on top of the hardware exception mechanism. So this isn't really a general design question.
If you want to be able to recover, you need to be transactional--each state change needs to run to completion or be backed out completely. RAII is one part of that. As Chris Becke points out in another answer, there's more to state than resource acquisition.
There's a copy-modify-swap idiom that's used a lot for transactions, but that might be way too heavy if you're trying to adapt working code to handle this one-in-a-million case.
If you truly need robustness, then isolate the code into a process. If a hardware fault kills the process, you can have a watchdog restart it. The OS will reclaim the lost resources. Your code would only need to worry about being transactional with persistent state, like stuff saved to files.
Do you have control over libraryAPI implementation ?
If it can fit into OO model, you need to design it using RAII pattern, which guarantees the destructor (who will release acquired resources) to be invoked on exception.
usage of resource-manage-helper such as smart pointer do help too
try
{
someNormalFunction();
cSmartPtr<BYTE> pBuf = malloc(1000);
someExceptionThrowingFunction();
}
catch(...)
{
// Do logging and other necessary actions
// but no cleaning required for <pBuf>
}
The problem with exeptions is - even if you do re-engineer with RAiI - its still easy to make code that becomes desynchronized:
void SomeClass::SomeMethod()
{
this->stateA++;
SomeOtherMethod();
this->stateB++;
}
Now, the example might look artifical, but if you substitue stateA++ and stateB++ for operations that change the state of the class in some way, the expected outcome of this class is for the states to remain in sync. RAII might solve some of the problems associated with state when using exceptions, but all it does is provide a false sense of security - If SomeOtherMethod() throws an exception ALL the surrounding code needs to be analyzed to ensure that the post conditions (stateA.delta == stateB.delta) are met.