How to wait for unknown number of processes to end

The scenario:
There are several processes running on a machine. Names and handles unknown, but they all have a piece of code running in them that's under our control.
A command line process is run. It signals to the other processes that they need to end (SetEvent), which our code picks up and handles within the other processes.
The goal:
The command line process needs to wait until the other processes have ended. How can this be achieved?
All that's coming to mind is to set up some shared memory or something and have each process write its handle into it so the command line process can wait on them, but this seems like so much effort for what it is. There must be some kernel level reference count that can be waited on?
Edit 1:
I'm thinking maybe assigning the processes to a job object, then the command line processes can wait on that? Not ideal though...
Edit 2:
Can't use job objects as it would interfere with other things using jobs. So now I'm thinking that the processes would obtain a handle to some/any sync object (semaphore, event, etc.), and the command line process would poll for its existence. It would have to poll, because if it waited on the object it would keep the object alive. The sync object gets cleaned up by Windows when the processes die, so the next poll would indicate that there are no processes. Not the nicest, cleanest method, but simple enough for the job it needs to do. Any advance on that?

You can do it in any of the following ways.
Shared memory (memory-mapped object): call CreateFileMapping, then MapViewOfFile, process the request, then call UnmapViewOfFile and close the file mapping.
Named pipe: create a named pipe for each application and keep a thread running to read from it. Your application can then send the end request by connecting to that named pipe. (You can implement a small database in the same way.)
WinSock: (Don't use this if you have a large number of processes, since you need to send the end request to each of them. Either the process should bind to your application or it should be listening on a port.)
File/DB: share a file between the processes. (You can have multiple files if needed.) Lock the file before reading or writing.

I would consider a solution using two objects:
a shared semaphore object, created by the main (controller?) app with an initial count of 0, just before it requests the other processes to terminate (by calling SetEvent()). I assume that the other processes don't create this event object, nor do they fail if it has not been created yet.
a mutex object, created by the other (child?) processes, used not for waiting on but to let the main process check for its existence (once all child processes have terminated, it should be destroyed). A named mutex can be "created" by more than one process (according to the documentation): later callers simply get a handle to the existing object.
Synchronization would be as follows:
The child processes on initialization should create the Mutex object (set initial ownership to FALSE).
The child processes upon receiving the termination request should increase the semaphore count by one (ReleaseSemaphore()) and then exit normally.
The main process would enter a loop calling WaitForSingleObject() on the semaphore with a reasonably small timeout (e.g. some 250 ms), and then check not whether the wait was satisfied or timed out, but whether the mutex still exists. If it does not, all child processes have terminated.
This setup avoids making an interprocess communication scheme (eg having the child processes communicating their handles back - the number of which is unknown anyway), while it's not strictly speaking "polling" either. Well, there is some timeout involved (and some may argue that this alone is polling), but the check is also performed after each process has reported that it's terminating (you can employ some tracing to see how many times the timeout has actually elapsed).

The simple approach: you already have an event object that every subordinate process has open, so you can use that. After setting the event in the master process, close the handle, and then poll until you discover that the event object no longer exists.
The better approach: named pipes as a synchronization object, as already suggested. That sounds complicated, but it isn't.
The idea is that each of the subordinate processes creates an instance of the named pipe (i.e., all with the same name) when starting up. There's no need for a listening thread, or indeed any I/O logic at all; you just need to create the instance using CreateNamedPipe, then throw away the handle without closing it. When the process exits, the handle is closed automatically, and that's all we need.
To see whether there are any subordinate processes, the master process would attempt to connect to that named pipe using CreateFile. If it gets a file not found error, there are no subordinate processes, so we're done.
If the connection succeeded, there's at least one subordinate process that we need to wait for. (When you attempt to connect to a named pipe with more than one available instance, Windows chooses which instance to connect you to. It doesn't matter to us which one it is.)
The master process would then call ReadFile (just a simple synchronous read, one byte will do) and wait for it to fail. Once you've confirmed that the error code is ERROR_BROKEN_PIPE (it will be, unless something has gone seriously wrong) you know that the subordinate process in question has exited. You can then loop around and attempt another connection, until no more subordinate processes remain.
(I'm assuming here that the user will have to intervene if one or more subordinates have hung. It isn't impossible to keep track of the process IDs and do something programmatically if that is desirable, but it's not entirely trivial and should probably be a separate question.)
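A minimal sketch of the named pipe approach described above, for illustration only. The pipe name and the function names announce_alive and wait_for_subordinates are made up for this example; the Win32 calls are the ones the answer mentions. The code is Windows-only, so it is wrapped in #ifdef _WIN32.

```cpp
#include <cassert>
#ifdef _WIN32
#include <windows.h>

// Hypothetical pipe name; every subordinate must use the same one.
static const wchar_t kPipeName[] = L"\\\\.\\pipe\\MyCompany_MyProduct_Alive";

// Subordinate side: call once at startup. The handle is deliberately never
// closed; Windows closes it when the process exits, and that is the signal.
bool announce_alive() {
    HANDLE pipe = CreateNamedPipeW(kPipeName, PIPE_ACCESS_DUPLEX,
                                   PIPE_TYPE_BYTE | PIPE_WAIT,
                                   PIPE_UNLIMITED_INSTANCES, 0, 0, 0, nullptr);
    return pipe != INVALID_HANDLE_VALUE;
}

// Master side: call after SetEvent(). Returns true once no pipe instance is
// left, false on any unexpected error.
bool wait_for_subordinates() {
    for (;;) {
        HANDLE client = CreateFileW(kPipeName, GENERIC_READ, 0, nullptr,
                                    OPEN_EXISTING, 0, nullptr);
        if (client == INVALID_HANDLE_VALUE)
            return GetLastError() == ERROR_FILE_NOT_FOUND;  // nobody left

        char byte = 0;
        DWORD got = 0;
        // Nothing is ever written, so this blocks until the owner exits.
        BOOL ok = ReadFile(client, &byte, 1, &got, nullptr);
        DWORD err = ok ? ERROR_SUCCESS : GetLastError();
        CloseHandle(client);
        if (ok || err != ERROR_BROKEN_PIPE) return false;
    }
}
#endif
```

In this sketch any error other than "file not found" on connect (for example ERROR_PIPE_BUSY, if a second master is already connected to every instance) is reported as a failure rather than retried.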

Related

Obtain thread handles/id of a specific process

I have a multi-threaded embedded architecture that contains 6 application-specific processes, which are started when the initialization process runs. Each has its own set of running threads.
What I want to do is suspend the running threads of one particular process, based on whether the device is connected to the PC or not.
I have tried searching around, and the closest I've found to what I'm looking for is the following: How to obtain list of thread handles from a win32 process?
However, that code returns the list of all running threads. This won't work for me, since I'm trying to suspend all the threads it returns on the assumption that they come from the same process, so I do not check which process they belong to.
Also, I am obtaining the list of a process's running threads from another process.
Is there an existing Windows method that allows such control, or am I stuck with having to identify which threads I need to suspend from the entire list?

Instead of trying to forcefully suspend threads (which is likely to bring you trouble when you suspend at a "not so lucky moment"), you'd be better off using a named manual-reset event created with CreateEvent().
Named events are easily shared between processes. You simply call CreateEvent() again with the same name. A typical name for the event would be MyCompany_MyProduct_MyFeature_EventName, to prevent accidental collisions.
When you WaitForSingleObject() on a "set" event, the wait is immediately satisfied.
When you wait on a "reset" event, the wait suspends your thread until the event is set.
Your first application will have its thread(s) wait on the event when they're not doing any work and are therefore safe to suspend.
You will set and reset the event from the second application to control the first application.
This way, you don't need to enumerate threads, and it's more robust.
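The scheme above might look roughly like this. It is only a sketch: the event name and the helper names (open_run_event, wait_until_allowed, pause_workers, resume_workers) are invented for the example, and the code is Windows-only, hence the #ifdef _WIN32 guard.

```cpp
#include <cassert>
#ifdef _WIN32
#include <windows.h>

// Hypothetical event name, following the naming pattern suggested above.
static const wchar_t kRunEvent[] = L"MyCompany_MyProduct_MyFeature_RunAllowed";

// Either process may call this. Whoever calls first creates the event
// (manual-reset, initially set); later callers get a handle to the same object.
HANDLE open_run_event() {
    return CreateEventW(nullptr, TRUE, TRUE, kRunEvent);
}

// First application: each worker thread calls this between units of work.
// Returns immediately while the event is set, blocks while it is reset.
bool wait_until_allowed(HANDLE run_event) {
    return WaitForSingleObject(run_event, INFINITE) == WAIT_OBJECT_0;
}

// Second application: pause or resume the first application's threads.
bool pause_workers(HANDLE run_event)  { return ResetEvent(run_event) != FALSE; }
bool resume_workers(HANDLE run_event) { return SetEvent(run_event) != FALSE; }
#endif
```

The workers only ever stop inside wait_until_allowed, so they are never paused while holding a lock or halfway through an update.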

unix accept() function returns the same file descriptor twice

I have a problem with my multithreaded networking server program.
I have a main thread that is listening for new client connections. I use Linux epoll to get I/O event notifications. For each incoming event, I create a thread that calls accept() on the new connection and assigns an fd to it. Under heavy load, it can happen that the same fd is assigned twice, causing my program to crash.
My question is: how can the system re-assign a fd that is still used by another thread?
Thanks,

Presumably there is a race condition here - but without seeing your code it's hard to diagnose.
You would be better off calling accept on the main thread and then passing the accepted socket to the new thread.
If you pass your listening socket to a new thread to then perform the accept - you're going to hit a race condition.
For further information you can look here: https://stackoverflow.com/a/4687952/516138
And this is a good background on networking efficiency (although perhaps a bit out of date).

You should call accept() on the same thread that you are calling epoll() on. Otherwise you are inviting race conditions.
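As an illustration of accepting on the epoll thread, here is a sketch for Linux. The names listen_loopback, accept_loop and hand_off are hypothetical, and the max_conns limit exists only so the example terminates; a real server would loop indefinitely. hand_off stands for whatever passes the descriptor to a worker (starting a thread, pushing onto a queue).

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <functional>
#include <netinet/in.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

// Creates a non-blocking TCP listener on 127.0.0.1:port (0 = any free port).
int listen_loopback(uint16_t port) {
    int fd = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
    if (fd < 0) return -1;
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr) != 0 ||
        listen(fd, SOMAXCONN) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Waits with epoll and calls accept() on the same thread. Each accepted
// descriptor is passed to hand_off exactly once; its receiver becomes the
// only code allowed to close it. Returns the number of accepted connections.
int accept_loop(int listen_fd, int max_conns,
                const std::function<void(int)>& hand_off) {
    int ep = epoll_create1(0);
    if (ep < 0) return -1;
    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = listen_fd;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, listen_fd, &ev) != 0) {
        close(ep);
        return -1;
    }
    int accepted = 0;
    while (accepted < max_conns) {
        epoll_event ready{};
        if (epoll_wait(ep, &ready, 1, -1) < 0) {
            if (errno == EINTR) continue;
            break;
        }
        // Drain the backlog. Any accept() failure ends the inner loop; with a
        // non-blocking listener that is normally EAGAIN (nothing pending).
        while (accepted < max_conns) {
            int fd = accept(listen_fd, nullptr, nullptr);
            if (fd < 0) break;
            ++accepted;
            hand_off(fd);
        }
    }
    close(ep);
    return accepted;
}
```

Because only one thread ever touches the listening socket, and each accepted descriptor has a single owner, a descriptor number can only be reused after its owner has closed it.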

File descriptors are allocated on a per-process basis. This means that descriptor numbers are unique only within a process, and that all threads in the same process share the same descriptors.
Having an accept syscall return the same file descriptor twice inside the same process is a very strong indication that one of your threads is closing the previous "version" of the repeated file descriptor: the kernel hands out the lowest free number, so a closed descriptor is reused right away.
Issues like this one may be difficult to debug in complex software. One way to track it down on a Linux system is the strace command. Run strace -f -e trace=close,accept4,accept,pipe,open <your program>. It prints the syscalls named in the command, along with the thread that made each call.

Number of parallel instances of my process (app)

Is there some portable way to check the number of parallel instances of my app?
I have a C++ app (Win32) where I need to know how many times it has been started. The problem is that several users can start it in parallel (terminal server), so I cannot search the "running process" list because I'm not able to access the process lists of other users.
I tried it with a semaphore (Boost and Win32 CreateSemaphore).
It worked, but now I have the problem that if the app crashes (an assertion, or the process is just killed), the counter is not changed. (Rebooting helps.)
Also, manually removing or resetting the semaphore counter in my code is not possible because I don't know if somebody else is running my application.

Edited to add:
Suppose you have a license that lets you run 20 full-functionality copies of your program. Then you could have 20 mutexes, named MyProgMutex1 through MyProgMutex20. At startup, your program can loop through the mutexes. If it finds a spare mutex that it can take, it stops looping and enters full-functionality mode. If it loops through all the mutexes without being able to take any of them, then it enters reduced-functionality mode.
Original answer:
I assume you want to make sure that only one copy of your process runs at once. (Or, for Terminal Server, one copy of your process per login session).
Your named semaphore solution is close. The right way to do this is a named mutex. Use CreateMutex to make the mutex, then call WaitForSingleObject with a timeout of zero. If WaitForSingleObject returns WAIT_TIMEOUT, another copy of the process is running. If it returns WAIT_OBJECT_0 or WAIT_ABANDONED, then you are the only copy of the process. You need to keep the mutex handle open while your program runs: either call CloseHandle when your process is about to exit, or just deliberately leak the handle and rely on Windows' built-in cleanup to close it for you when your process exits. Unlike a semaphore count, ownership of a mutex is released automatically by Windows when the owning process exits or crashes.
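A sketch of the mutex-per-licence idea from the edit above, under a few assumptions: the function name acquire_slot is invented, the mutex names follow the MyProgMutex1..N pattern from the answer, and the code is Windows-only (hence the #ifdef _WIN32 guard). To count instances across Terminal Server sessions, the names would presumably need the Global\ prefix, since unprefixed names are local to a session.

```cpp
#include <cassert>
#include <string>
#ifdef _WIN32
#include <windows.h>

// Tries MyProgMutex1 .. MyProgMutex<max_slots> in turn. Returns the handle of
// the first mutex this process could take, or nullptr if all are in use.
// Keep the returned handle open for the lifetime of the process.
HANDLE acquire_slot(int max_slots) {
    for (int i = 1; i <= max_slots; ++i) {
        std::wstring name = L"MyProgMutex" + std::to_wstring(i);
        HANDLE mutex = CreateMutexW(nullptr, FALSE, name.c_str());
        if (!mutex) continue;
        DWORD result = WaitForSingleObject(mutex, 0);  // zero timeout: just probe
        if (result == WAIT_OBJECT_0 || result == WAIT_ABANDONED)
            return mutex;   // free, or its previous owner died without releasing
        CloseHandle(mutex); // WAIT_TIMEOUT: another instance owns this slot
    }
    return nullptr;
}
#endif
```

A caller would enter full-functionality mode when acquire_slot(20) returns a handle and reduced-functionality mode when it returns nullptr.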

The only thing I can think of that mitigates the problem of crashed processes is a kind of “dead man’s switch”: each process needs to update its status in regular intervals. If a process fails to do this, it’s automatically discarded from the list of active processes.
This technique requires that one of the processes acts as a server which keeps tabs on whether the other processes have updated recently. If the server dies, then another process can take over. This, in turn, requires that each process tests whether there is still a server alive.
Alternatively, each process can be its own server and keep track locally. This may be easier to implement than server-switching.

You can broadcast a message, and the other instances of your application should send a response. Count the responses and you get the number of instances.

What is an easy way to test whether any process of a given id is presently running on Linux?

In C++, I have a resource that is tied to a pid. Sometimes the process associated with that pid exits abnormally and leaks the resource.
Therefore, I'm thinking of putting the pid in the file that records the resource as being in use. Then when I go to get a resource, if I see an item as registered as being in use, I would search to see whether a process matching the pid is currently running, and if not, clean up the leaked resource.
I realize there is a very small probability that a new, unrelated process is now using the same pid, but this is better than the leaking with no cleanup that I have now.
Alternatively, perhaps there is a better solution for this, if so, please suggest, otherwise, I'll pursue the pid recording.
Further details: The resource is a port number for communication between a client and a server over tcp. Only one instance of the client may use a given port number on a machine. The port numbers are taken from a range of available port numbers to use. While the client is running, it notes the port number it is using in a special file on disk and then cleans this entry up on exit. For abnormal exit, this does not always get cleaned up and the port number is left annotated as being in use, when it is no longer being used.

To check for the existence of a process with a given ID, use kill(pid, 0) (I assume you are on a POSIX system). See man 2 kill for details.
Also, you can use the waitpid call to be notified when the process finishes, provided it is a child of the calling process.
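A small sketch of the kill(pid, 0) check. The function name process_exists is made up for this example. Note that the check also reports true for a zombie (a process that has exited but has not yet been waited for).

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

// Returns true if a process with this pid currently exists.
// kill() with signal 0 sends nothing; it only performs the existence and
// permission checks.
bool process_exists(pid_t pid) {
    if (pid <= 0) return false;          // 0 and negatives address process groups
    if (kill(pid, 0) == 0) return true;  // exists and we are allowed to signal it
    return errno == EPERM;               // exists, but belongs to another user
}
```

When the call fails with ESRCH, no such process exists and the recorded resource can be reclaimed.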

I would recommend you use some kind of OS resource, not a PID. Mutexes, semaphores, delete-on-close files. All of these are cleaned up by the OS when a process exits.
On Windows, I would recommend a named mutex.
On Linux, I would recommend using flock on a file.
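For the Linux suggestion, a sketch using flock could look like the following. The name try_claim and the idea of one lock file per port number are assumptions made for the example. The kernel drops the lock when the holding process exits, however it exits, so no stale entries are left behind.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

// Tries to claim the resource guarded by lock_path (for instance one lock
// file per port number). Returns an open descriptor that holds the lock, or
// -1 if another holder exists or the file cannot be opened.
int try_claim(const char* lock_path) {
    int fd = open(lock_path, O_RDWR | O_CREAT | O_CLOEXEC, 0644);
    if (fd < 0) return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {  // EWOULDBLOCK: already claimed
        close(fd);
        return -1;
    }
    return fd;  // keep it open for as long as the resource is in use
}
```

Closing the returned descriptor (or exiting) releases the claim, so the special file that records ports in use no longer needs explicit cleanup.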

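How about a master process that starts your process (the one which terminates abnormally), waits for it to exit or crash (waitpid), and spawns it again when waitpid returns?
for (;;) {
    pid_t pid = fork();
    if (pid < 0) break;                               // fork failed, give up
    if (pid == 0) { execv(path, argv); _exit(127); }  // child: become the worker (path, argv: its program)
    int status;
    waitpid(pid, &status, 0);                         // parent: block until the worker dies
    // clean up anything the worker leaked, then loop around to respawn it
}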

The problem domain isn't clear, unfortunately; you could try re-explaining it in some other way.
But if I understand you correctly, you could create a map like
std::map< ProcessId, boost::shared_ptr<Resource> > map;
// `Resource` here references to some abstract resource type
// and `ProcessId` on Windows system would be basically a DWORD
and in this case you simply have to list every running process (this can be done via the EnumProcesses call on Windows) and remove from your map every entry whose ID no longer belongs to a running process. After doing this you would have only valid process-resource pairs left. This action can be repeated every YY seconds depending on your needs.
Note that in this case removing an item from your map would basically call the corresponding destructor (because, if your resource is not being used somewhere else in your code, its reference count would drop to zero).

The APIs that achieve this on Windows are OpenProcess, which takes a process ID as input, and GetExitCodeProcess, which returns STILL_ACTIVE when the process is, well, still active. You could also use any wait function with a zero timeout, but this API seems somewhat cleaner.
As other answers note, however, this doesn't seem a promising road to take. We might be able to give more focused advice if you provide more scenario details. What is your platform? What is the leaked resource exactly? Do you have access to the leaking app code? Can you wrap it in a high-level try-catch with some cleanup? If not, maybe wait on the leaker to finish with a dedicated thread (or dedicated process altogether)? Any detail you provide might help.

How to wait for a cloned child process of an invoked process to exit?

I have a program which needs to invoke a process to perform an operation and wait for it to complete the operation. The problem is that the invoked process clones itself and exits, which causes the wait api to return when the process exits. How can I wait for the cloned process to finish execution and return?
I am using the Windows job object as described in http://www.microsoft.com/msj/0399/jobkernelobj/jobkernelobj.aspx, but I am not sure if this is the best way.

Umm, I'm pretty sure you can get the parent process ID of any process. I'd iterate through all the processes, find the ones whose parent ID matches that of the process you spawned, and wait for them to die.
Alternatively (I mean, that's pretty hacky), what is the child's child process doing? Is there some other way you could detect when it has finished doing what it is meant to do?
A hacky way to get a process's parent ID:
http://www.codeguru.com/cpp/w-p/win32/article.php/c1437
Takes a handle and, using the method in the code above, returns the parent ID.
http://msdn.microsoft.com/en-us/library/ms684280(VS.85).aspx
OpenProcess takes an ID and gets a handle to it (if you're lucky).
http://msdn.microsoft.com/en-us/library/ms684320(VS.85).aspx
GetProcessId takes a handle and gets its ID.
http://msdn.microsoft.com/en-us/library/ms683215(VS.85).aspx
GetExitCodeProcess takes a handle and returns whether the process is done or not.
http://msdn.microsoft.com/en-us/library/ms683189(VS.85).aspx
So, apart from using hidden NT calls that the documentation expressly tells you not to use, you would basically have to create your process, get its ID, then go through all the processes, opening them and checking their parent IDs against the ID of the process you created. If you don't find one, then it's done; if you do, poll it with GetExitCodeProcess until it's done.
I haven't tested any of this, but it looks like a way to do it. Though if it's the best way to do it, I might just have to lose all faith in Windows...

+1 for using job objects ;)
Assuming the process that you're running isn't spawning the cloned version of itself in such a way that it breaks out of the job...
You should be able to simply monitor the job events and act on JOB_OBJECT_MSG_ACTIVE_PROCESS_ZERO (see JOBOBJECT_ASSOCIATE_COMPLETION_PORT and SetInformationJobObject()). Monitoring the job in this way will also give you notifications of the process IDs of new processes created within the job and details of when they exit.
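A rough sketch of monitoring a job through a completion port, assuming the invoked process does not break out of the job. The function name run_and_wait_for_job is invented for the example; the code is Windows-only and therefore wrapped in #ifdef _WIN32.

```cpp
#include <cassert>
#ifdef _WIN32
#include <windows.h>

// Runs command_line (a writable buffer, as CreateProcessW requires) inside a
// new job and blocks until the job has no active processes left.
bool run_and_wait_for_job(wchar_t* command_line) {
    HANDLE job = CreateJobObjectW(nullptr, nullptr);
    HANDLE port = CreateIoCompletionPort(INVALID_HANDLE_VALUE, nullptr, 0, 1);
    bool ok = job && port;

    JOBOBJECT_ASSOCIATE_COMPLETION_PORT assoc = {};
    assoc.CompletionKey = job;    // echoed back with every job message
    assoc.CompletionPort = port;
    ok = ok && SetInformationJobObject(job,
                                       JobObjectAssociateCompletionPortInformation,
                                       &assoc, sizeof assoc);

    STARTUPINFOW si = {};
    si.cb = sizeof si;
    PROCESS_INFORMATION pi = {};
    // Start suspended so the process cannot clone itself before it is in the job.
    ok = ok && CreateProcessW(nullptr, command_line, nullptr, nullptr, FALSE,
                              CREATE_SUSPENDED, nullptr, nullptr, &si, &pi);
    if (ok && !AssignProcessToJobObject(job, pi.hProcess)) {
        TerminateProcess(pi.hProcess, 1);  // never resumed, so just discard it
        ok = false;
    }
    if (ok) {
        ResumeThread(pi.hThread);
        DWORD msg = 0;
        ULONG_PTR key = 0;
        LPOVERLAPPED overlapped = nullptr;
        // Job notifications arrive as completion packets; msg holds the
        // JOB_OBJECT_MSG_* value.
        for (;;) {
            if (!GetQueuedCompletionStatus(port, &msg, &key, &overlapped, INFINITE)) {
                ok = false;
                break;
            }
            if (msg == JOB_OBJECT_MSG_ACTIVE_PROCESS_ZERO) break;
        }
    }
    if (pi.hThread) CloseHandle(pi.hThread);
    if (pi.hProcess) CloseHandle(pi.hProcess);
    if (port) CloseHandle(port);
    if (job) CloseHandle(job);
    return ok;
}
#endif
```

Associating the port before the process is resumed means no notification can be missed, even if the process clones itself and exits immediately.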

If you have control over the source of invoked process, one possible solution would be to make it wait for the process it spawns by cloning itself.
