What is the closest thing Windows has to fork()?

I guess the question says it all.
I want to fork on Windows. What is the most similar operation, and how do I use it?

Cygwin has a fully featured fork() on Windows. If using Cygwin is acceptable and performance is not a concern, the problem is solved.
Otherwise, you can look at how Cygwin implements fork(). From a rather old Cygwin architecture document:
5.6. Process Creation
The fork call in Cygwin is particularly interesting because it does not map well on top of the Win32 API. This makes it very difficult to implement correctly. Currently, the Cygwin fork is a non-copy-on-write implementation similar to what was present in early flavors of UNIX.
The first thing that happens when a parent process forks a child process is that the parent initializes a space in the Cygwin process table for the child. It then creates a suspended child process using the Win32 CreateProcess call. Next, the parent process calls setjmp to save its own context and sets a pointer to this in a Cygwin shared memory area (shared among all Cygwin tasks). It then fills in the child's .data and .bss sections by copying from its own address space into the suspended child's address space. After the child's address space is initialized, the child is run while the parent waits on a mutex. The child discovers it has been forked and longjumps using the saved jump buffer. The child then sets the mutex the parent is waiting on and blocks on another mutex. This is the signal for the parent to copy its stack and heap into the child, after which it releases the mutex the child is waiting on and returns from the fork call. Finally, the child wakes from blocking on the last mutex, recreates any memory-mapped areas passed to it via the shared area, and returns from fork itself.
While we have some ideas as to how to speed up our fork implementation by reducing the number of context switches between the parent and child process, fork will almost certainly always be inefficient under Win32. Fortunately, in most circumstances the spawn family of calls provided by Cygwin can be substituted for a fork/exec pair with only a little effort. These calls map cleanly on top of the Win32 API. As a result, they are much more efficient. Changing the compiler's driver program to call spawn instead of fork was a trivial change and increased compilation speeds by twenty to thirty percent in our tests.
However, spawn and exec present their own set of difficulties. Because there is no way to do an actual exec under Win32, Cygwin has to invent its own Process IDs (PIDs). As a result, when a process performs multiple exec calls, there will be multiple Windows PIDs associated with a single Cygwin PID. In some cases, stubs of each of these Win32 processes may linger, waiting for their exec'd Cygwin process to exit.
Sounds like a lot of work, doesn't it? And yes, it is slooooow.
EDIT: the doc is outdated, please see this excellent answer for an update

I certainly don't know the details on this because I've never done it, but the native NT API has a capability to fork a process (the POSIX subsystem on Windows needs this capability - I'm not sure if the POSIX subsystem is even supported anymore).
A search for ZwCreateProcess() should get you some more details - for example this bit of information from Maxim Shatskih:
The most important parameter here is SectionHandle. If this parameter
is NULL, the kernel will fork the current process. Otherwise, this
parameter must be a handle of the SEC_IMAGE section object created on
the EXE file before calling ZwCreateProcess().
Though note that Corinna Vinschen indicates that Cygwin found using ZwCreateProcess() still unreliable:
Iker Arizmendi wrote:
> Because the Cygwin project relied solely on Win32 APIs its fork
> implementation is non-COW and inefficient in those cases where a fork
> is not followed by exec. It's also rather complex. See here (section
> 5.6) for details:
>
> http://www.redhat.com/support/wpapers/cygnus/cygnus_cygwin/architecture.html
This document is rather old, 10 years or so. While we're still using
Win32 calls to emulate fork, the method has changed noticably.
Especially, we don't create the child process in the suspended state
anymore, unless specific datastructes need a special handling in the
parent before they get copied to the child. In the current 1.5.25
release the only case for a suspended child are open sockets in the
parent. The upcoming 1.7.0 release will not suspend at all.
One reason not to use ZwCreateProcess was that up to the 1.5.25
release we're still supporting Windows 9x users. However, two
attempts to use ZwCreateProcess on NT-based systems failed for one
reason or another.
It would be really nice if this stuff would be better or at all
documented, especially a couple of datastructures and how to connect a
process to a subsystem. While fork is not a Win32 concept, I don't
see that it would be a bad thing to make fork easier to implement.

Well, Windows doesn't really have anything quite like it, especially since fork can conceptually be used to create either a thread or a process in *nix.
So, I'd have to say:
CreateProcess()/CreateProcessEx()
and
CreateThread() (I've heard that for C applications, _beginthreadex() is better).

People have tried to implement fork on Windows. This is the closest thing to it I can find:
Taken from: http://doxygen.scilab.org/5.3/d0/d8f/forkWindows_8c_source.html#l00216
static BOOL haveLoadedFunctionsForFork(void);
int fork(void)
{
    HANDLE hProcess = 0, hThread = 0;
    OBJECT_ATTRIBUTES oa = { sizeof(oa) };
    MEMORY_BASIC_INFORMATION mbi;
    CLIENT_ID cid;
    USER_STACK stack;
    PNT_TIB tib;
    THREAD_BASIC_INFORMATION tbi;
    CONTEXT context = {
        CONTEXT_FULL |
        CONTEXT_DEBUG_REGISTERS |
        CONTEXT_FLOATING_POINT
    };
    if (setjmp(jenv) != 0) return 0; /* return as a child */
    /* check whether the entry points are
       initialized and get them if necessary */
    if (!ZwCreateProcess && !haveLoadedFunctionsForFork()) return -1;
    /* create forked process */
    ZwCreateProcess(&hProcess, PROCESS_ALL_ACCESS, &oa,
                    NtCurrentProcess(), TRUE, 0, 0, 0);
    /* set the Eip for the child process to our child function */
    ZwGetContextThread(NtCurrentThread(), &context);
    /* In x64 the Eip and Esp are not present,
       their x64 counterparts are Rip and Rsp respectively. */
#if _WIN64
    /* Rip is 64 bits wide; a ULONG cast would truncate the address */
    context.Rip = (DWORD64)child_entry;
#else
    context.Eip = (ULONG)child_entry;
#endif
#if _WIN64
    ZwQueryVirtualMemory(NtCurrentProcess(), (PVOID)context.Rsp,
                         MemoryBasicInformation, &mbi, sizeof mbi, 0);
#else
    ZwQueryVirtualMemory(NtCurrentProcess(), (PVOID)context.Esp,
                         MemoryBasicInformation, &mbi, sizeof mbi, 0);
#endif
    stack.FixedStackBase = 0;
    stack.FixedStackLimit = 0;
    stack.ExpandableStackBase = (PCHAR)mbi.BaseAddress + mbi.RegionSize;
    stack.ExpandableStackLimit = mbi.BaseAddress;
    stack.ExpandableStackBottom = mbi.AllocationBase;
    /* create thread using the modified context and stack */
    ZwCreateThread(&hThread, THREAD_ALL_ACCESS, &oa, hProcess,
                   &cid, &context, &stack, TRUE);
    /* copy exception table */
    ZwQueryInformationThread(NtCurrentThread(), ThreadBasicInformation,
                             &tbi, sizeof tbi, 0);
    tib = (PNT_TIB)tbi.TebBaseAddress;
    ZwQueryInformationThread(hThread, ThreadBasicInformation,
                             &tbi, sizeof tbi, 0);
    ZwWriteVirtualMemory(hProcess, tbi.TebBaseAddress,
                         &tib->ExceptionList, sizeof tib->ExceptionList, 0);
    /* start (resume really) the child */
    ZwResumeThread(hThread, 0);
    /* clean up */
    ZwClose(hThread);
    ZwClose(hProcess);
    /* exit with child's pid */
    return (int)cid.UniqueProcess;
}
static BOOL haveLoadedFunctionsForFork(void)
{
    HANDLE ntdll = GetModuleHandle("ntdll");
    if (ntdll == NULL) return FALSE;
    if (ZwCreateProcess && ZwQuerySystemInformation && ZwQueryVirtualMemory &&
        ZwCreateThread && ZwGetContextThread && ZwResumeThread &&
        ZwQueryInformationThread && ZwWriteVirtualMemory && ZwClose)
    {
        return TRUE;
    }
    ZwCreateProcess = (ZwCreateProcess_t) GetProcAddress(ntdll,
        "ZwCreateProcess");
    ZwQuerySystemInformation = (ZwQuerySystemInformation_t)
        GetProcAddress(ntdll, "ZwQuerySystemInformation");
    ZwQueryVirtualMemory = (ZwQueryVirtualMemory_t)
        GetProcAddress(ntdll, "ZwQueryVirtualMemory");
    ZwCreateThread = (ZwCreateThread_t)
        GetProcAddress(ntdll, "ZwCreateThread");
    ZwGetContextThread = (ZwGetContextThread_t)
        GetProcAddress(ntdll, "ZwGetContextThread");
    ZwResumeThread = (ZwResumeThread_t)
        GetProcAddress(ntdll, "ZwResumeThread");
    ZwQueryInformationThread = (ZwQueryInformationThread_t)
        GetProcAddress(ntdll, "ZwQueryInformationThread");
    ZwWriteVirtualMemory = (ZwWriteVirtualMemory_t)
        GetProcAddress(ntdll, "ZwWriteVirtualMemory");
    ZwClose = (ZwClose_t) GetProcAddress(ntdll, "ZwClose");
    if (ZwCreateProcess && ZwQuerySystemInformation && ZwQueryVirtualMemory &&
        ZwCreateThread && ZwGetContextThread && ZwResumeThread &&
        ZwQueryInformationThread && ZwWriteVirtualMemory && ZwClose)
    {
        return TRUE;
    }
    else
    {
        ZwCreateProcess = NULL;
        ZwQuerySystemInformation = NULL;
        ZwQueryVirtualMemory = NULL;
        ZwCreateThread = NULL;
        ZwGetContextThread = NULL;
        ZwResumeThread = NULL;
        ZwQueryInformationThread = NULL;
        ZwWriteVirtualMemory = NULL;
        ZwClose = NULL;
    }
    return FALSE;
}

Prior to Microsoft introducing the Windows Subsystem for Linux, CreateProcess() was the closest thing Windows had to fork(), but Windows requires you to specify an executable to run in that process.
UNIX process creation is quite different from Windows. Its fork() call duplicates the current process almost entirely, gives each copy its own address space, and lets both continue running separately. While the processes themselves are different, they are still running the same program. See here for a good overview of the fork/exec model.
Going the other way, the equivalent of the Windows CreateProcess() is the fork()/exec() pair of functions in UNIX.
If you are porting software to Windows and don't mind a translation layer, Cygwin provides the capability you want, though it is rather kludgy.
Of course, with the Windows Subsystem for Linux, the closest thing Windows has to fork() is actually fork() :-)
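To make the fork()/exec() pairing concrete, here is a minimal POSIX sketch (it will not build with a native Windows toolchain). The helper name run_shell_command and the choice of /bin/sh as the program to exec are assumptions made for illustration only.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run `command` through /bin/sh in a child process
// and return its exit code, or -1 if the child could not be created or
// did not exit normally.
int run_shell_command(const char* command)
{
    pid_t pid = fork();                 // duplicate the calling process
    if (pid == -1)
        return -1;                      // fork failed

    if (pid == 0) {
        // Child: still running the same program as the parent.
        // exec replaces that program with a new executable.
        execl("/bin/sh", "sh", "-c", command, static_cast<char*>(nullptr));
        _exit(127);                     // only reached if exec failed
    }

    // Parent: wait for the child and report how it ended.
    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The fork() and execl() calls together do roughly what a single CreateProcess() call does on Windows: create a new process and start a given executable in it.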

As other answers have mentioned, NT (the kernel underlying modern versions of Windows) has an equivalent of Unix fork(). That's not the problem.
The problem is that cloning a process's entire state is not generally a sane thing to do. This is as true in the Unix world as it is in Windows, but in the Unix world, fork() is used all the time, and libraries are designed to deal with it. Windows libraries aren't.
For example, the system DLLs kernel32.dll and user32.dll maintain a private connection to the Win32 server process csrss.exe. After a fork, there are two processes on the client end of that connection, which is going to cause problems. The child process should inform csrss.exe of its existence and make a new connection – but there's no interface to do that, because these libraries weren't designed with fork() in mind.
So you have two choices. One is to forbid the use of kernel32 and user32 and other libraries that aren't designed to be forked – including any libraries that link directly or indirectly to kernel32 or user32, which is virtually all of them. This means that you can't interact with the Windows desktop at all, and are stuck in your own separate Unixy world. This is the approach taken by the various Unix subsystems for NT.
The other option is to resort to some sort of horrible hack to try to get unaware libraries to work with fork(). That's what Cygwin does. It creates a new process, lets it initialize (including registering itself with csrss.exe), then copies most of the dynamic state over from the old process and hopes for the best. It amazes me that this ever works. It certainly doesn't work reliably – even if it doesn't randomly fail due to an address space conflict, any library you're using may be silently left in a broken state. The claim of the current accepted answer that Cygwin has a "fully-featured fork()" is... dubious.
Summary: In an Interix-like environment, you can fork by calling fork(). Otherwise, please try to wean yourself from the desire to do it. Even if you're targeting Cygwin, don't use fork() unless you absolutely have to.

The following document provides some information on porting code from UNIX to Win32:
https://msdn.microsoft.com/en-us/library/y23kc048.aspx
Among other things, it indicates that the process model is quite different between the two systems and recommends consideration of CreateProcess and CreateThread where fork()-like behavior is required.

"as soon as you want to do file access or printf then io are refused"
You cannot have your cake and eat it too... In msvcrt.dll, printf() is based on the Console API, which itself uses LPC to communicate with the console subsystem (csrss.exe). The connection with csrss is initiated at process start-up, which means that any process that begins its execution "in the middle" will have that step skipped. Unless you have access to the source code of the operating system, there is no point in trying to connect to csrss manually. Instead, you should create your own subsystem, and accordingly avoid the console functions in applications that use fork().
Once you have implemented your own subsystem, don't forget to also duplicate all of the parent's handles for the child process ;-)
"Also, you probably shouldn't use the Zw* functions unless you're in kernel mode, you should probably use the Nt* functions instead."
This is incorrect. When accessed in user mode, there is absolutely no difference between Zw* and Nt*; these are merely two different (ntdll.dll) exported names that refer to the same (relative) virtual address.
ZwGetContextThread(NtCurrentThread(), &context);
Obtaining the context of the current (running) thread by calling ZwGetContextThread is wrong, is likely to crash, and (due to the extra system call) is also not the fastest way to accomplish the task.

Your best options are CreateProcess() or CreateThread(). There is more information on porting here.

There is no easy way to emulate fork() on Windows.
I suggest you use threads instead.

fork() semantics are necessary where the child needs access to the actual memory state of the parent as of the instant fork() is called. I have a piece of software which relies on the implicit mutex of memory copying as of the instant fork() is called, which makes threads impossible to use. (This is emulated on modern *nix platforms via copy-on-write/update-memory-table semantics.)
The closest that exists on Windows as a syscall is CreateProcess. The best that can be done is for the parent to freeze all other threads during the time that it is copying memory over to the new process's memory space, then thaw them. Neither the Cygwin frok [sic] class nor the Scilab code that Eric des Courtis posted does the thread-freezing, that I can see.
Also, you probably shouldn't use the Zw* functions unless you're in kernel mode, you should probably use the Nt* functions instead. There's an extra branch that checks whether you're in kernel mode and, if not, performs all of the bounds checking and parameter verification that Nt* always do. Thus, it's very slightly less efficient to call them from user mode.

The closest you say... Let me think... This must be fork() I guess :)
For details see Does Interix implement fork()?

Most of the hacky solutions are outdated. Winnie the fuzzer has a version of fork that works on current versions of Windows 10 (though it requires system-specific offsets and can break easily too).
https://github.com/sslab-gatech/winnie/tree/master/forklib

If you only care about creating a subprocess and waiting for it, perhaps the _spawn* APIs in process.h are sufficient. Here's more information about that:
https://learn.microsoft.com/en-us/cpp/c-runtime-library/process-and-environment-control
https://en.wikipedia.org/wiki/Process.h
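A minimal sketch of the "spawn and wait" pattern. On Windows it uses _spawnlp with _P_WAIT from process.h. The non-Windows branch uses posix_spawnp, which this answer does not mention, and is included only so the sketch also builds elsewhere. The helper name spawn_and_wait is hypothetical.

```cpp
#ifdef _WIN32
#include <process.h>
#else
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
extern "C" char** environ;   // environment passed on to the child
#endif

// Hypothetical helper: start `program` (searched for on PATH) with one
// argument, wait for it to finish, and return its exit code (-1 on failure).
int spawn_and_wait(const char* program, const char* arg)
{
#ifdef _WIN32
    // _P_WAIT blocks until the child exits and returns its exit status.
    return static_cast<int>(_spawnlp(_P_WAIT, program, program, arg,
                                     static_cast<const char*>(nullptr)));
#else
    char* argv[] = {const_cast<char*>(program), const_cast<char*>(arg), nullptr};
    pid_t pid;
    if (posix_spawnp(&pid, program, nullptr, nullptr, argv, environ) != 0)
        return -1;
    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
#endif
}
```

Either branch creates the child directly from an executable, so no copy of the parent's address space is needed.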

Related

Where is the pid stored?

I have the following question in an assignment:
Once every second, a process calls the following function:
#include <ctime>
#include <string>
#include <unistd.h>
using namespace std;
string create_file_name(time_t timestamp) {
    pid_t pid = getpid();
    string s = "results-" + to_string(pid) + to_string(timestamp);
    return s;
}
The question is: where does the kernel store the process PID?
There are five possible answers:
user's stack \ kernel's stack \ heap \ PCB \ runqueue
Now generally, I know that the PID is stored inside the PCB but in this case, should it also be stored inside the user's stack? (since it's a local variable).
The question seems to have only one answer, so I am quite confused.

As the man page says:
From glibc version 2.3.4 up to and including version 2.24, the glibc
wrapper function for getpid() cached PIDs, with the goal of avoiding
additional system calls when a process calls getpid() repeatedly.
Normally this caching was invisible, but its correct operation relied
on support in the wrapper functions for fork(2), vfork(2), and
clone(2): if an application bypassed the glibc wrappers for these
system calls by using syscall(2), then a call to getpid() in the
child would return the wrong value (to be precise: it would return
the PID of the parent process). In addition, there were cases where
getpid() could return the wrong value even when invoking clone(2) via
the glibc wrapper function. (For a discussion of one such case, see
BUGS in clone(2).) Furthermore, the complexity of the caching code
had been the source of a few bugs within glibc over the years.
Because of the aforementioned problems, since glibc version 2.25, the
PID cache is removed: calls to getpid() always invoke the actual
system call, rather than returning a cached value.
On Alpha, instead of a pair of getpid() and getppid() system calls, a
single getxpid() system call is provided, which returns a pair of PID
and parent PID. The glibc getpid() and getppid() wrapper functions
transparently deal with this. See syscall(2) for details regarding
register mapping.
It depends on the glibc version you use. Some versions of glibc maintain a cache of the PID, while others invoke the system call every time to get the PID of the process. If you want to know how the system call works, I suggest you look at the kernel code.
You can find the getpid() function at this link (you can change the kernel version and navigate the source code to work out how the getpid() syscall works).

How can I tell if pthread_self is the main (first) thread in the process?

Background: I'm working on a logging library that is used by many programs.
I'm assigning a human-readable name for each thread, the main thread should get "main", but I'd like to be able to detect that state from within the library without requiring code at the beginning of each main() function.
Also note: The library code will not always be entered first from the main thread.

This is kinda doable, depending on the platform you're on, but absolutely not in any portable and generic way...
Mac OS X seems to be the only one with a direct and documented approach, according to their pthread.h file:
/* returns non-zero if the current thread is the main thread */
int pthread_main_np(void);
I also found that FreeBSD has a pthread_np.h header that defines pthread_main_np(), so this should work on FreeBSD too (8.1 at least), and OpenBSD (4.8 at least) has pthread_main_np() defined in pthread.h too. Note that _np explicitly stands for non-portable!
Otherwise, the only more "general" approach that comes to mind is comparing the PID of the process to the TID of the current thread, if they match, that thread is main.
This does not necessarily work on all platforms, it depends on if you can actually get a TID at all (you can't in OpenBSD for example), and if you do, if it has any relation to the PID at all or if the threading subsystem has its own accounting that doesn't necessarily relate.
I also found that some platforms give back constant values as TID for the main thread, so you can just check for those.
A brief summary of platforms I've checked:
Linux: possible here, syscall(SYS_gettid) == getpid() is what you want
FreeBSD: not possible here, thr_self() seems random and without relation to getpid()
OpenBSD: not possible here, there is no way to get a TID
NetBSD: possible here, _lwp_self() always returns 1 for the main thread
Solaris: possible here, pthread_self() always returns 1 for the main thread
So basically you should be able to do it directly on Mac OS X, FreeBSD and OpenBSD.
You can use the TID == PID approach on Linux.
You can use the TID == 1 approach on NetBSD and Solaris.
I hope this helps, have a good day!
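A minimal sketch of the Linux check from the summary above, assuming a Linux system where the gettid system call is reachable through syscall(). The function is_main_thread_from_worker is a hypothetical helper added only to show that the check returns false off the main thread.

```cpp
#include <sys/syscall.h>
#include <unistd.h>
#include <thread>

// Linux-only: the main thread's kernel thread ID equals the process ID.
bool is_main_thread()
{
    return syscall(SYS_gettid) == getpid();
}

// Hypothetical demonstration helper: evaluates is_main_thread() on a
// freshly started thread and returns what that thread observed.
bool is_main_thread_from_worker()
{
    bool result = true;
    std::thread worker([&result] { result = is_main_thread(); });
    worker.join();
    return result;
}
```

A logging library could call is_main_thread() lazily the first time a thread logs, so nothing has to be added to main().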

Call pthread_self() from main() and record the result. Compare future calls to pthread_self() to your stored value to know if you're on the main thread.

You can utilize some kind of shared name resource that allows threads that spawn to register a name (perhaps a map of thread id to name). Your logging system can then place a call into a method that gets the name via the thread ID in a thread-safe manner.
When the thread dies, have it remove its name from the mapping to avoid leaking memory.
This method should allow all threads to be named, not just main.
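One possible shape for such a registry, as a sketch only: a mutex-protected map keyed by std::thread::id. All names here (register_thread_name, unregister_thread_name, current_thread_name, name_seen_by_worker) and the "unnamed" fallback are assumptions for illustration.

```cpp
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>

namespace {
std::mutex g_names_mutex;                                   // guards g_names
std::unordered_map<std::thread::id, std::string> g_names;   // thread ID -> name
}

// Called by a thread when it starts.
void register_thread_name(const std::string& name)
{
    std::lock_guard<std::mutex> lock(g_names_mutex);
    g_names[std::this_thread::get_id()] = name;
}

// Called by a thread just before it exits, so the map does not grow forever.
void unregister_thread_name()
{
    std::lock_guard<std::mutex> lock(g_names_mutex);
    g_names.erase(std::this_thread::get_id());
}

// Called by the logger; falls back to "unnamed" for unregistered threads.
std::string current_thread_name()
{
    std::lock_guard<std::mutex> lock(g_names_mutex);
    auto it = g_names.find(std::this_thread::get_id());
    return it != g_names.end() ? it->second : "unnamed";
}

// Hypothetical demonstration helper: a worker registers itself, looks up
// its own name, unregisters, and reports what it saw.
std::string name_seen_by_worker(const std::string& name)
{
    std::string seen;
    std::thread worker([&] {
        register_thread_name(name);
        seen = current_thread_name();
        unregister_thread_name();
    });
    worker.join();
    return seen;
}
```

Note that this still needs each thread, including the main one, to register itself at some point; it does not detect the main thread on its own.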

Injecting thread with codecave

Using the 'codecave' technique to inject code into another process, is it possible to inject code that creates a new thread (and also inject the code for the new thread) and let that thread execute in parallel with the target process's main thread?
I can manage this with dll injection but I want to know if it is possible with just pure code injection.
The intention is first of all to learn about different injection techniques but in the end create a heartbeat feature for random processes in order to supervise execution (High Availability). Windows is the target OS and language is C/C++ (with inline ASM when required).
Thanks.

There is CreateRemoteThread function.

When using a DLL injection loader such as "Winject" (the one that calls CreateRemoteThread), it is very easy to create threads that remain until the target process closes.
Just create the thread within DllMain:
#include <windows.h>
#include <process.h>
void run_thread(void* arg)
{
    // do stuff until process terminates
}
BOOL APIENTRY DllMain(HMODULE hModule, DWORD reason, LPVOID lpReserved)
{
    // Start the worker once, when the DLL is first loaded into the process.
    if (reason == DLL_PROCESS_ATTACH)
        _beginthread(run_thread, 0, NULL);
    return TRUE;
}
regards,
michael

Sure, but you would have to also inject the code for the remote thread into the process (e.g. a function). Injecting an entire function into a remote process is a pain because there is no clear-cut way to determine the size of a function. This approach would be far more effective if the injected code was small, in which case you would just inject a short assembly stub, then call CreateRemoteThread.
Really though, what would be a benefit of doing this over just straight-up DLL injection? Your 'heartbeat' feature could be implemented just as easily with an injected DLL. (unless someone is going to tell me there's significant overhead?)

The problem is, even if you inject your code into the process, unless you create a thread at the start of your injected code, it will still not run. Typically, to do code injection you would inject a full DLL. One popular way of injecting a DLL is to:
1. Get a handle to the target process (EnumProcesses, CreateToolhelp32Snapshot/Process32First/Process32Next, FindWindow/GetWindowThreadProcessId/OpenProcess, etc.)
2. Allocate memory in the target process large enough to hold the path of your DLL as a string (VirtualAllocEx)
3. Write the path of your DLL to this allocated memory (WriteProcessMemory)
4. Create a remote thread at the LoadLibrary routine (get the address by GetModuleHandle/GetProcAddress) and pass the pointer to the allocated memory as a parameter (CreateRemoteThread)
5. Release the allocated memory (VirtualFreeEx)
6. Close any opened handles (process handles, snapshot handles, etc. with CloseHandle)
Unless there is a particular reason you want to avoid this method, it is by far preferable to copying in the code yourself (WriteProcessMemory and probably setting up page protections (VirtualProtectEx)). Without loading a library you will need to manually map variables, relocate function pointers and all the other work LoadLibrary does.
You asked earlier about the semantics of CreateRemoteThread. It will create a thread in another process which will keep going until it terminates itself or something else does (someone calls TerminateThread or the process terminates and calls ExitProcess, etc.). The thread will run in parallel, in the same way as a thread that was legitimately created would (context switching).

You can also use the RtlCreateUserThread function to create the remote thread.

Side effects of exit() without exiting?

If my application runs out of memory, I would like to re-run it with changed parameters. I have malloc / new in various parts of the application, the sizes of which are not known in advance. I see two options:
1. Track all memory allocations and write a restarting procedure which deallocates everything before re-running with changed parameters. (Of course, I free memory at the appropriate places if no errors occur.)
2. Restart the application (e.g., with WinExec() on Windows) and exit.
I am not thrilled by either solution. Did I miss an alternative?
Thanks

You could embed all the application functionality in a class, and let it throw an exception when it runs out of memory. Your application would catch this exception, and then you could simply destroy the object, construct a new one and try again. All in one application in one run, no need for restarts. Of course this might not be so easy, depending on what your application does...
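A rough sketch of that pattern. Application is a hypothetical stand-in for the real program, and kPretendLimit is an artificial threshold that makes the constructor throw std::bad_alloc deterministically; in real code the exception would come from a failed new.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Assumption for demonstration: pretend only 1 MiB can be allocated.
constexpr std::size_t kPretendLimit = std::size_t{1} << 20;

// Hypothetical stand-in for "the whole application". Everything it
// allocates is owned by the object, so destroying it frees everything.
class Application {
public:
    explicit Application(std::size_t buffer_bytes)
    {
        if (buffer_bytes > kPretendLimit)
            throw std::bad_alloc();      // stands in for a real allocation failure
        buffer_.resize(buffer_bytes);
    }
    // Placeholder for the real work; returns the buffer size it ran with.
    std::size_t run() const { return buffer_.size(); }
private:
    std::vector<char> buffer_;
};

// Retry with half the buffer size after every out-of-memory failure.
// Returns the size that finally worked, or 0 if none did.
std::size_t run_with_retry(std::size_t bytes)
{
    while (bytes > 0) {
        try {
            Application app(bytes);      // construct with the current parameters
            return app.run();
        } catch (const std::bad_alloc&) {
            // `app` is already destroyed here, so its memory is released.
            bytes /= 2;                  // changed parameters for the next attempt
        }
    }
    return 0;
}
```

Starting from 8 MiB, the loop fails at 8, 4 and 2 MiB and then succeeds at 1 MiB, all within a single process.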

There is another option, one I have used in the past, however it requires having planned for it from the beginning, and it's not for the library-dependent programmer:
Create your own heap. It's a lot simpler to destroy a heap than to cleanup after yourself.
Doing so requires that your application is heap-aware. That means that all memory allocations have to go to that heap and not the default one. In C++ you can simply override the static new/delete operators which takes care of everything your code allocates, but you have to be VERY aware of how your libraries, even the standard library, use memory. It's not as simple as "never call a library method that allocates memory". You have to consider each library method on a case-by-case basis.
It sounds like you've already built your app and are looking for a shortcut to memory wiping. If that is the case, this will not help as you could never tack this kind of thing onto an already built application.

The wrapper program (as proposed before) does not need to be a separate executable. You could just fork, run your program and then test the return code of the child. This would have the additional benefit that the operating system automatically reclaims the child's memory when it dies (at least I think so).
Anyway, I imagined something like this (this is C, you might have to change the includes for C++):
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#define OUT_OF_MEMORY 99 /* or whatever; must fit in 8 bits (0-255) */
int main(void)
{
    int pid, status;
fork_entry:
    pid = fork();
    if (pid == 0) {
        /* child - call the main function of your program here */
    } else if (pid > 0) {
        /* parent (supervisor) */
        wait(&status); /* waiting for the child to terminate */
        /* see if child exited normally
           (i.e. by calling exit(), _exit() or by returning from main()) */
        if (WIFEXITED(status)) {
            /* if so, we can get the status code */
            if (WEXITSTATUS(status) == OUT_OF_MEMORY) {
                /* change parameters */
                goto fork_entry; /* forking again */
            }
        }
    } else {
        /* fork() error */
        return 1;
    }
    return 0;
}
This might not be the most elegant solution/workaround/hack, but it's easy to do.

A way to accomplish this:
Define an exit status, perhaps like this:
static const int OUT_OF_MEMORY=9999;
Set up a new handler and have it do this:
exit(OUT_OF_MEMORY);
Then just wrap your program with another program that detects this
exit status. When it does then it can rerun the program.
Granted this is more of a workaround than a solution...
The wrapper program I mentioned above could be something like this:
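#include <cstdlib>
static const int special_code = 9999;
int main()
{
    const char* command = "whatever";
    int status = system(command);
    while ( status == special_code )
    {
        command = ...;
        status = system(command);
    }
    return 0;
}
That's the basic idea. I would use std::string instead of char* in production. I'd probably also have another condition for breaking out of the while loop, some maximum number of tries perhaps. Note that on POSIX systems only the low 8 bits of an exit status are reported, and system() returns a wait status that has to be decoded with WEXITSTATUS(), so there you would pick a code below 256.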
Whatever the case, I think the fork/exec route mentioned below is pretty solid, and I'm pretty sure a solution like it could be created for Windows using spawn and its brethren.

Simplicity rules: just restart your app with different parameters.
It is very hard either to track down all allocs/deallocs and clean up the memory (just forget some minor blocks inside bigger chunks [fragmentation] and you still have problems rerunning the class), or to introduce your own heap management (very clever people have invested years to bring nedmalloc etc. to life; do not fool yourself into thinking this is an easy task).
so:
catch "out of memory" somehow (signals, or std::bad_alloc, or whatever)
create a new process of your app:
windows: CreateProcess() (you can just exit() your program after this, which cleans up all allocated resources for you)
unix: exec() (replaces the current process completely, so it "cleans up all the memory" for you)
done.

Be warned that on Linux, by default, your program can request more memory than the system has available. (This is done for a number of reasons, e.g. avoiding memory duplication when fork()ing a program into two with identical data, when most of the data will remain untouched.) Memory pages for this data won't be reserved by the system until you actually write to each page you've allocated.
Since there's no good way to report this (any memory write can cause your system to run out of memory), your process will be terminated by the out-of-memory killer, and you won't have the information or opportunity for your process to restart itself with different parameters.
You can change the default by using the setrlimit system call to set RLIMIT_AS, which limits the total amount of address space your process can request (RLIMIT_RSS has no effect on current Linux kernels). Only after you have done this will malloc return NULL or new throw a std::bad_alloc exception when you reach the limit that you have set.
Be aware that on a heavily loaded system, other processes can still contribute to a systemwide out of memory condition that could cause your program to be killed without malloc or new raising an error, but if you manage the system well, this can be avoided.
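A minimal sketch of such a limit in action, assuming Linux. It uses RLIMIT_AS (the address-space limit), since that is the limit under which malloc fails on current kernels. The helper name malloc_succeeds_under_limit and the sizes used with it are illustrative assumptions.

```cpp
#include <sys/resource.h>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: temporarily lower the soft address-space limit to
// `limit_bytes`, try to malloc `request_bytes`, then restore the old
// limit. Returns true if the allocation succeeded, and false if it
// failed or the limit could not be changed.
bool malloc_succeeds_under_limit(std::size_t limit_bytes, std::size_t request_bytes)
{
    struct rlimit old_limit;
    if (getrlimit(RLIMIT_AS, &old_limit) != 0)
        return false;

    struct rlimit new_limit = old_limit;
    new_limit.rlim_cur = limit_bytes;        // soft limit only; hard limit untouched
    if (setrlimit(RLIMIT_AS, &new_limit) != 0)
        return false;

    // volatile keeps the compiler from optimizing the malloc/free pair away.
    void* volatile block = std::malloc(request_bytes);
    bool ok = (block != nullptr);
    std::free(block);

    setrlimit(RLIMIT_AS, &old_limit);        // put the original limit back
    return ok;
}
```

With the limit in place, an oversized request fails immediately with a null pointer instead of succeeding now and getting the process killed later, which gives the program a chance to restart itself with smaller parameters.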

Can I use execvp() on a function defined inside my program?

I have a C++ function that I'd like to call using execvp(), due to the way my program is organized.
Is this possible?

All of the exec variants including execvp() can only call complete programs visible in the filesystem. The good news is that if you want to call a function in your already loaded program, all you need is fork(). It will look something like this pseudo-code:
int pid = fork();
if (pid == 0) {
    // Call your function here. This is a new process and any
    // changes you make will not be reflected back into the parent
    // variables. Be careful with files and shared resources like
    // database connections.
    _exit(0);
}
else if (pid == -1) {
    // An error happened and the fork() failed. This is a very rare
    // error, but you must handle it.
}
else {
    // Wait for the child to finish. You can use a signal handler
    // to catch it later if the child will take a long time.
    waitpid(pid, ...);
}
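A runnable version of that outline might look like the sketch below (POSIX only). The helper call_in_child, the bump function and the g_counter variable are hypothetical, and returning the function's result through the child's exit status is an assumption that only works for values from 0 to 255.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run `fn` in a forked child and return its result,
// passed back through the exit status (so only values 0-255 survive).
// Returns -1 if fork or waitpid fails or the child ends abnormally.
int call_in_child(int (*fn)())
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;                      // fork failed

    if (pid == 0)
        _exit(fn() & 0xFF);             // child: call the function, then leave

    int status = 0;
    if (waitpid(pid, &status, 0) == -1) // parent: wait for the child
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

// Example function that modifies a global. The change happens in the
// child's copy of memory only.
int g_counter = 0;
int bump() { return ++g_counter; }
```

Calling call_in_child(bump) returns 1, while g_counter in the parent stays 0, which illustrates the comment about changes not being reflected back into the parent.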

execvp() is meant to start a program, not a function. So you'll have to wrap that function into a compiled executable file and then have that file's main call your function.

Creating processes can be heavyweight. If you really only want to call your function in parallel, why not use threads? There are many platform-independent libraries with threading support for C++, such as Boost, Qt or ACE.
If you really need your function to be executed in another process, you can use fork or vfork. vfork may not be available on every platform and it has its drawbacks, so check whether you can use it. If not, just use fork.
