If my application runs out of memory, I would like to re-run it with changed parameters. I have malloc / new in various parts of the application, the sizes of which are not known in advance. I see two options:
Track all memory allocations and write a restarting procedure which deallocates all before re-running with changed parameters. (Of course, I free memory at the appropriate places if no errors occur)
Restarting the application (e.g., with WinExec() on Windows) and exiting
I am not thrilled by either solution. Did I miss an alternative maybe.
Thanks
You could embedd all the application functionality in a class. Then let it throw an expection when it runs out of memory. This exception would be catched by your application and then you could simply destroy the class, construct a new one and try again. All in one application in one run, no need for restarts. Of course this might not be so easy, depending on what your application does...
There is another option, one I have used in the past, however it requires having planned for it from the beginning, and it's not for the library-dependent programmer:
Create your own heap. It's a lot simpler to destroy a heap than to cleanup after yourself.
Doing so requires that your application is heap-aware. That means that all memory allocations have to go to that heap and not the default one. In C++ you can simply override the static new/delete operators which takes care of everything your code allocates, but you have to be VERY aware of how your libraries, even the standard library, use memory. It's not as simple as "never call a library method that allocates memory". You have to consider each library method on a case-by-case basis.
It sounds like you've already built your app and are looking for a shortcut to memory wiping. If that is the case, this will not help as you could never tack this kind of thing onto an already built application.
The wrapper-program (as proposed before) does not need to be a seperate executable. You could just fork, run your program and then test the return code of the child. This would have the additional benefit, that the operating system automatically reclaims the child's memory when it dies. (at least I think so)
Anyway, I imagined something like this (this is C, you might have to change the includes for C++):
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#define OUT_OF_MEMORY 99999 /* or whatever */
int main(void)
{
int pid, status;
fork_entry:
pid = fork();
if (pid == 0) {
/* child - call the main function of your program here */
} else if (pid > 0) {
/* parent (supervisor) */
wait(&status); /* waiting for the child to terminate */
/* see if child exited normally
(i.e. by calling exit(), _exit() or by returning from main()) */
if (WIFEXITED(status)) {
/* if so, we can get the status code */
if (WEXITSTATUS(status) == OUT_OF_MEMORY) {
/* change parameters */
goto fork_entry; /* forking again */
}
}
} else {
/* fork() error */
return 1;
}
return 0;
}
This might not be the most elegant solution/workaround/hack, but it's easy to do.
A way to accomplish this:
Define an exit status, perhaps like this:
static const int OUT_OF_MEMORY=9999;
Set up a new handler and have it do this:
exit(OUT_OF_MEMORY);
Then just wrap your program with another program that detects this
exit status. When it does then it can rerun the program.
Granted this is more of a workaround than a solution...
The wrapper program I mentioned above could be something like this:
static int special_code = 9999;
int main()
{
const char* command = "whatever";
int status = system(command);
while ( status == 9999 )
{
command = ...;
status = system(command);
}
return 0;
}
That's the basicness of it. I would use std::string instead of char* in production. I'd probably also have another condition for breaking out of the while loop, some maximum number of tries perhaps.
Whatever the case, I think the fork/exec route mentioned below is pretty solid, and I'm pretty sure a solution like it could be created for Windows using spawn and its brethren.
simplicity rules: just restart your app with different parameters.
it is very hard to either track down all allocs/deallocs and clean up the memory (just forget some minor blocks inside bigger chunks [fragmentation] and you still have problems to rerun the class), or to do introduce your own heap-management (very clever people have invested years to bring nedmalloc etc to live, do not fool yourself into the illusion this is an easy task).
so:
catch "out of memory" somehow (signals, or std::bad_alloc, or whatever)
create a new process of your app:
windows: CreateProcess() (you can just exit() your program after this, which cleans up all allocated resources for you)
unix: exec() (replaces the current process completely, so it "cleans up all the memory" for you)
done.
Be warned that on Linux, by default, your program can request more memory than the system has available. (This is done for a number of reasons, e.g. avoiding memory duplication when fork()ing a program into two with identical data, when most of the data will remain untouched.) Memory pages for this data won't be reserved by the system until you try to write in every page you've allocated.
Since there's no good way to report this (since any memory write can cause your system to run out memory), your process will be terminated by the out of memory process killer, and you won't have the information or opportunity for your process to restart itself with different parameters.
You can change the default by using the setrlimit system call, to to limit the RLIMIT_RSS which limits the total amount of memory your process can request. Only after you have done this will malloc return NULL or new throw a std::bad_alloc exception when you reach the limit that you have set.
Be aware that on a heavily loaded system, other processes can still contribute to a systemwide out of memory condition that could cause your program to be killed without malloc or new raising an error, but if you manage the system well, this can be avoided.
Related
I am using a memory pool to create a lot of objects. All my objects derive from a Base class which has its new/delete over ridden to use my memory pool, basically they call pool.allocate(size).
What I would like to do is when the pool runs out of memory (there is still available memory in the system to function) I would like to set everything back to the beginning. I am thinking of setting a label right after main and goto label when allocation fails, reset the pool and start again.
All non stack allocation are handled by the memory pool. Is this a sensible way to achieve this? Are there gonna be any problems down the line?
EDIT:
This is running on an embedded platform so no OS no exceptions. I am trying to achieve a controlled restart instead of a crash from out of memory. Pool is big enough to hold calculations, I am trying to have a controlled crash in case some functions goes awry.
There is no state to be saved from run to run. I am trying to achive the process of hitting the reset button with software. So I can reset back to start of main notify the app about the restart.
I once did a similar thing using setjmp()/longjmp(). It's not perfect or devoid of problems, but, for the most part it works. Like:
jmp_buf g_env;
int main()
{
int val = setjmp(g_env);
if (val) {
// restarting, do what you need to do
}
// initialize your program and go to work
}
/// where you want to restart:
longjmp(g_env, 101); /// 101 or some other "error" code
This is a goto really, so, remember to do any cleanup yourself.
The first thing which comes in mind is throwing an exception. This is actually how default implementation of operator new behaves: it throws std::bad_alloc.
goto will work only if whole your program is limited to a single function. I see possible implementation of main as follows:
int main(int argc, const char* argv[]) {
while(true) {
try {
do_things(argc, argv);
} catch(const MyBadAlloc& ex) {
do_cleanup();
continue;
}
}
}
I'm trying to check when the console is closed through the close button on Windows. I read about SetConsoleCtrlHandler and I thought I'd use that, but there's some cleanup I want to do in my main function. I'll make a small example describing what I want to do for my larger program.
BOOL CtrlHandler( DWORD fdwCtrlType )
{
switch( fdwCtrlType )
{
//Cleanup exit
case CTRL_CLOSE_EVENT:
bool* programIsOn = &???; //How do I pass the address to that variable in this function?
*programIsOn = false;
return( TRUE );
default:
return FALSE;
}
}
int main(){
MyObject obj = new MyObject();
bool programIsOn = true;
//How do I pass the address of programIsOn here?
if(!SetConsoleCtrlHandler( (PHANDLER_ROUTINE) CtrlHandler, TRUE )){
cout << "Could not set CtrlHandler. Exiting." << endl;
return 0;
}
while(programIsOn){
//...
}
//CLEANUP HERE
delete obj;
return 0;
}
I want to perform cleanup when my program closes via the console close event, however if I just close the console the main function doesn't terminate and is forced to stop. I thought of passing in programIsOn's address to the CtrlHandler callback but I have no idea how to do this without using a global variable.
TL;DR: Proper handling of this control signal is complicated. Don't bother with any 'clean-up' unless it's absolutely necessary.
The system creates a new thread (see the Remarks) in your application, which is then used to execute the handler function you registered. That immediately causes a few issues and forces you in a particular design direction.
Namely, your program suddenly became multi-threaded, with all the complications that brings. Just setting a 'program should stop' (global) boolean variable to true in the handler is not going to work; this has to be done in a thread-aware manner.
Another complication this handler brings is that the moment it returns the program is terminated as per a call to ExitProcess. This means that the handler should wait for the program to finish, again in a thread-aware manner. Queue the next complication, where the OS gives you only 10 seconds to respond to the handler before the program is terminated anyway.
The biggest issue here, I think, is that all these issues force your program to be designed in a very particular way that potentially permeates every nook and cranny of your code.
It's not necessary for your program to clean up any handles, objects, locks or memory it uses: these will all be cleaned up by Windows when your program exits.
Therefore, your clean-up code should consists solely of those operations that need to happen and otherwise wouldn't happen, such as write the end of a log file, delete temporary files, etc.
In fact, it is recommended to not perform such clean-up, as it only slows down the closing of the application and can be so hard to get right in 'unexpected termination' cases; The Old New Thing has a wonderful post about it that's also relevant to this situation.
There are two general choices here for the way to handle the remaining clean-up:
The handler routine does all the clean-up, or
the main application does all the clean-up.
Number 1 has the issue that it's very hard to determine what clean-up to perform (as this depends on where the main program is currently executing) and it's doing so 'while the engine is still running'. Number 2 means that every piece of code in the the main application needs to be aware of the possibility of termination and have short-circuit code to handle such.
So if you truly must, necessarily, absolutely, perform some additional clean-up, choose method 2. Add a global variable, preferably a std::atomic<bool> if C++11 is available to you, and use that to track whether or not the program should exit. Have the handler set it to true
// Shared global variable to track forced termination.
std::atomic<bool> programShouldExit = false;
// In the console handler:
BOOL WINAPI CtrlHandler( DWORD fdwCtrlType )
{
...
programShouldExit = true;
Sleep(10000); // Sleep for 10 seconds; after this returns the program will be terminated if it hasn't already.
}
// In the main application, regular checks should be made:
if (programShouldExit.load())
{
// Short-circuit execution, such as return from function, throw exception, etc.
}
Where you can pick your favourite short-circuiting method, for instance throwing an exception and using the RAII pattern to guard resources.
In the console handler, we sleep for as long as we think we can get away with (it doesn't really matter); the hope is that the main thread will have exited by then causing the application to exit. If not, either the sleep ends, the handler returns and the application is closed, or the OS became impatient and killed the process.
Conclusion: Don't bother with clean-up. Even if there is something you prefer to have done, such as deleting temporary files, I'd recommend you don't. It's truly not worth the hassle (but that's my opinion). If you really must, then use thread-safe means to notify the main thread that it must exit. Modify all longer-running code to handle the exit status and all other code to handle the failure of the longer-running code. Exceptions and RAII can be used to make this more manageable, for instance.
And this is why I feel that it's a very poor design choice, born from legacy code. Just being able to handle an 'exit request' requires you to jump through hoops.
It's more a philosophical type of question.
In C++ we have nice shiny idiom - RAII. But often I see it as incomplete. It does not well aligns with the fact that my application can be killed with SIGSEGV.
I know, I know, programms like that are malformed you say. But there is sad fact that on POSIX (specifically Linux) you can allocate beyond physical memory limits and meet SIGSEGV in the middle of the execution, working with correctly allocated memory.
You may say: "Application dies, why should you care about those poor destructors not being called?". Unfortunately there are some resources that are not automatically freed when application terminates, such as File System entities.
I am pretty sick right now of designing hacks, breaking good application design just to cope with this. So, what I am asking is for a nice, elegant solution to this kind of problems.
Edit:
It seems that I was wrong, and on Linux applications are killed by a kernel pager. In which case the question is still the same, but the cause of application death is different.
Code snippet:
struct UnlinkGuard
{
UnlinkGuard(const std::string path_to_file)
: _path_to_file(path_to_file)
{ }
~UnlinkGuard() {
unlink();
}
bool unlink() {
if (_path_to_file.empty())
return true;
if (::unlink(_path_to_file.c_str())) {
/// Probably some logging.
return false;
}
disengage();
return true;
}
void disengage() {
_path_to_file.clear();
}
private:
std::string _path_to_file;
};
void foo()
{
/// Pick path to temp file.
std::string path_to_temp_file = "...";
/// Create file.
/// ...
/// Set up unlink guard.
UnlinkGuard unlink_guard(path_to_temp_file);
/// Call some potentially unsafe library function that can cause process to be killed either:
/// * by a SIGSEGV
/// * by out of memory
/// ...
/// Work done, file content is appropriate.
/// Rename tmp file.
/// ...
/// Disengage unlink guard.
unlink_guard.disengage();
}
On success I use file. On failure I want this file to be missing.
This could be achived if POSIX had support for link()-ing of previously unlinked file by file descriptor, but there is no such feature :(.
So, what I am asking is for a nice, elegant solution to this kind of problems.
None exists, neither for C++ nor for other languages. You are faced with a fundamental physical reality here, not a design decision: what happens when the user pulls the plug? No programming solution can guard against that (well, there’s restore-upon-restart).
What you can do is catch POSIX signals and sometimes you can even handle them – but it’s flakey and there are tons of caveats, which another discussion on Stack Overflow details.
Most resources should not be cleared up after a segfault. If you want to do it anyway, simply collect those resources (or rather, handlers for their cleanup) in a global array, trap SIGSEGV, iterate through the cleanup routine array in the handler (hoping that the relevant memory is still intact), and perform the cleanup.
More specifically, for temporary files it helps to create them inside one of the system’s temporary folders. It’s understood that these don’t always get cleaned up by their respective applications, and either the system or the user will periodically perform cleanup instead.
Usually the solution, regardless of language or OS, is to clean up when you start the program, not (only) when you terminate. If your program can create temporary files that it cleans up on shutdown, clean up the temporary files when you start the program too.
Most everything else, like file handles, tcp connections, and so forth, is killed by the OS when your application dies.
I a have third party function which I use in my program. I can't replace it; it's in a dynamic library, so I also can't edit it. The problem is that it sometimes runs for too long.
So, can I do anything to stop this function from running if it runs more than 10 seconds for example? (It's OK to close program in this scenario.)
PS. I have Linux, and this program won't have to be ported anywhere else.
What I want is something like this:
#include <stdio.h>
#include <stdlib.h>
void func1 (void) // I can not change contents of this.
{
int i; // random
while (i % 2 == 0);
}
int main ()
{
setTryTime(10000);
timeTry{
func1();
} catchTime {
puts("function executed too long, aborting..");
}
return 0;
}
Sure. And you'd do it just the way you suggested in your title: "signals".
Specifically, an "alarm" signal:
http://linux.die.net/man/2/alarm
http://beej.us/guide/bgipc/output/html/multipage/signals.html
If you really have to do this, you probably want to spawn a process that does nothing but invoke the function and return its result to the caller. If it runs too long, you can kill that process.
By putting it into its own process, you stand a decent (not great, but decent) chance of cleaning up at least most of what it was doing so when it dies unexpectedly it probably won't make a complete mess of things that will lead to later problem.
The potential problem with forcefully cancelling a running function is that it may "own" resources that it intended to return later. The kind of resources that can be problems include:
heap memory allocations (free store)
shared memory segments
threads
sockets
file handles
locks
Some of these resources are managed on a per-process basis, so letting the function run in a different process (perhaps using fork) makes it easier to kill cleanly. Other resources can outlive a process, and really must be cleaned up explicitly. Depending on your operating system, it's also possible that the function may be part-way through interacting with some hardware driver or device, and killing it unexpectedly may leave that driver or device in a bizarre state such that it won't work until after a restart.
If you happen to know that the function doesn't use any of these kind of resources, then you can kill it confidently. But, it's hard to guarantee that: in a large system with many such decisions - which the compiler can't check - evolution of code in functions like func1() is likely to introduce dependencies on such resources.
If you must do this, I'd suggest running it in a different process or thread, and using kill() for processes, pthread_kill if func1() has some support for terminating when a flag is set asynchronously, or the non-portable pthread_cancel if there's really no other choice.
I'm currently working on an exception-based error reporting system for Windows MSVC++ (9.0) apps (i.e. exception structures & types / inheritance, call stack, error reporting & logging and so on).
My question now is: how to correctly report & log an out-of-memory error?
When this error occurs, e.g. as an bad_alloc thrown by the new op, there may be many "features" unavailable, mostly concerning further memory allocation. Normally, I'd pass the exception to the application if it has been thrown in a lib, and then using message boxes and error log files to report and log it. Another way (mostly for services) is to use the Windows Event Log.
The main problem I have is to assemble an error message.
To provide some error information, I'd like to define a static error message (may be a string literal, better an entry in a message file, then using FormatMessage) and include some run-time info such as a call stack.
The functions / methods necessary for this use either
STL (std::string, std::stringstream, std::ofstream)
CRT (swprintf_s, fwrite)
or Win32 API (StackWalk64, MessageBox, FormatMessage, ReportEvent, WriteFile)
Besides being documented on the MSDN, all of them more (Win32) or less (STL) closed source in Windows, so I don't really know how they behave under low memory problems.
Just to prove there might be problems, I wrote a trivial small app provoking a bad_alloc:
int main()
{
InitErrorReporter();
try
{
for(int i = 0; i < 0xFFFFFFFF; i++)
{
for(int j = 0; j < 0xFFFFFFFF; j++)
{
char* p = new char;
}
}
}catch(bad_alloc& e_b)
{
ReportError(e_b);
}
DeinitErrorReporter();
return 0;
}
Ran two instances w/o debugger attached (in Release config, VS 2008), but "nothing happened", i.e. no error codes from the ReportEvent or WriteFile I used internally in the error reporting. Then, launched one instance with and one w/o debugger and let them try to report their errors one after the other by using a breakpoint on the ReportError line. That worked fine for the instance with the debugger attached (correctly reported & logged the error, even using LocalAlloc w/o problems)! But taskman showed a strange behaviour, where there's a lot of memory freed before the app exits, I suppose when the exception is thrown.
Please consider there may be more than one process [edit] and more than one thread [/edit] consuming much memory, so freeing pre-allocated heap space is not a safe solution to avoid a low memory environment for the process which wants to report the error.
Thank you in advance!
"Freeing pre-allocated heap space...". This was exactly that I thought reading your question. But I think you can try it. Every process has its own virtual memory space. With another processes consuming a lot of memory, this still may work if the whole computer is working.
pre-allocate the buffer(s) you need
link statically and use _beginthreadex instead of CreateThread (otherwise, CRT functions may fail) -- OR -- implement the string concat / i2a yourself
Use MessageBox (MB_SYSTEMMODAL | MB_OK) MSDN mentions this for reporting OOM conditions (and some MS blogger described this behavior as intended: the message box will not allocate memory.)
Logging is harder, at the very least, the log file needs to be open already.
Probably best with FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH, to avoid any buffering attempts. The first one requires that writes and your memory buffers are sector aligned (i.e. you need to query GetDiskFreeSpace, align your buffer by that, and write only to "multiple of sector size" file offsets, and in blocks that are multiples of sector size. I am not sure if this is necessary, or helps, but a system-wide OOM where every allocation fails is hard to simulate.
Please consider there may be more than one process consuming much memory, so freeing pre-allocated heap space is not a safe solution to avoid a low memory environment for the process which wants to report the error.
Under Windows (and other modern operating systems), each process has its own address space (aka memory) separate from every other running process. And all of that is separate from the literal RAM in the machine. The operating system has virtualized the process address space away from the physical RAM.
This is how Windows is able to push memory used by processes into the page file on the hard disk without those processes having any knowledge of what happened.
This is also how a single process can allocate more memory than the machine has physical RAM and yet still run. For instance, a program running on a machine with 512MB of RAM could still allocate 1GB of memory. Windows would just couldn't keep all of it in the RAM at the same time and some of it would be in the page file. But the program wouldn't know.
So consequently, if one process allocates memory, it does not cause another process to have less memory to work with. Each process is separate.
Each process only needs to worry about itself. And so the idea of freeing a pre-allocated chunk of memory is actually very viable.
You can't use CRT or MessageBox functions to handle OOM since they might need memory, as you describe. The only truly safe thing you can do is alloc a chunk of memory at startup you can write information into and open a handle to a file or a pipe, then WriteFile to it when you OOM out.