Broadcast message for all processes to exit(MPI) - c++

[MPI-C++]
I made an application that, under a specific condition, should shut down in all processes.
I tried doing this from the root process, but I also need to send a message to all the other processes telling them to terminate. How can I do this?

There is no way to quit an MPI application cleanly on all processes without communication. That means, if you have a condition that occurs only on a subset of the processes of your MPI application (e.g. you have an error on one of the processes), the only way to unilaterally quit the application is to call MPI_Abort. This will bring all MPI processes to an abrupt end, no matter where in the code each rank was at that moment. Since MPI_Abort is not a collective routine, it is not possible to perform any cleanup on any of the other ranks.
If you wish to have a clean exit, you need to regularly communicate between all ranks whether everything is still working on all ranks, or if it is time to quit. For example, you could regularly call MPI_Allreduce with MPI_SUM as the operation. If your exit condition is fulfilled on a process, make it send 1 as the data, otherwise make it send 0. Now you only need to check after the MPI_Allreduce if the sum is larger than 0, and if it is, quit your application in an orderly fashion.

Related

Using MPI, how can I synchronize the end of inter-dependent processes?

Ok, the title sounds confusing but the concept is not too bad. Basically, I have two processes that are running (let's call them process 0 and process 1). They both run a function at the same time. While this function is running, sometimes they need data from each other. So process 0 sometimes requests data from process 1 and vice versa. Since they rely on each other, I don't want one process to finish before the other. If process 0 finishes its work, it should continue checking for requests from process 1 (otherwise process 1 won't be able to finish). After both processes have finished their work, only then should they proceed.
I'm having trouble implementing this. Right now, I have each process send all other processes a notification when it finishes (so process 0 sends a notification to process 1 when its work is done). Then I have a loop that is supposed to continue until it has received a notification from all the other processes. Only then should the loop exit and the process continue. However, this isn't working. The processes keep going before the others have finished. I feel like there's probably a much simpler way to do this that I'm not thinking of.
I'm a complete newbie to MPI, so I hope I've explained this properly. Also, this needs to work for any number of processes, not just two. Thanks for your help!

Check if adjacent slave process is ended in MPI

In my MPI program, I want to send and receive information to adjacent processes. But if a process ends and doesn't send anything, its neighbors will wait forever. How can I resolve this issue? Here is what I am trying to do:
if (rank == 0) {
    // don't do anything until all slaves are done
} else {
    while (condition) {
        // send info to rank-1 and rank+1
        // if can receive info from rank-1, receive it, store received info locally
        // if cannot receive info from rank-1, use locally stored info
        // do the same for process rank+1
        // MPI_Barrier(slaves); (wait for other slaves to finish this iteration)
    }
}
I am going to check the boundaries of course. I won't check rank-1 when process number is 1 and I won't check rank+1 when process is the last one. But how can I achieve this? Should I wrap it with another while? I am confused.
I'd start by saying that MPI wasn't originally designed with your use case in mind. In general, MPI applications all start together and all end together. Not all applications fit into this model though, so don't lose hope!
There are two relatively easy ways of doing this and probably thousands of hard ones:
1. Use RMA to set flags on neighbors.
As has been pointed out in the comments, you can set up a tiny RMA window that exposes a single value to each neighbor. When a process is done working, it can do an MPI_Put on each neighbor to indicate that it's done and then MPI_Finalize. Before sending/receiving data to/from the neighbors, check to see if the flag is set.
2. Use a special tag when detecting shutdowns.
The tag value often gets ignored when sending and receiving messages, but this is a great time to use it. You can have two tags in your application. The first (we'll call it DATA) just indicates that this message contains data and you can process it as normal. The second (DONE) indicates that the process is done and is leaving the application. When receiving messages, you'll have to change the value for tag from whatever you're using to MPI_ANY_TAG. Then, when the message is received, check which tag it has. If it's DONE, then stop communicating with that process.
There's another problem with the pseudo-code that you posted however. If you expect to perform an MPI_Barrier at the end of every iteration, you can't have processes leaving early. When that happens, the MPI_Barrier will hang. There's not much you can do to avoid this unfortunately. However, given the code you posted, I'm not sure that the barrier is really necessary. It seems to me that the only inter-loop dependency is between neighboring processes. If that's the case, then the sends and receives will accomplish all of the necessary synchronization.
If you still need a way to track when all of the ranks are done, you can have each process alert a single rank (say rank 0) when it leaves. When rank 0 detects that everyone is done, it can just exit. Or, if you want to leave after some other number of processes is done, you can have rank 0 send out a message to all other ranks with a special tag like above (but add MPI_ANY_SOURCE so you can receive from rank 0).

Obtain thread handles/id of a specific process

I have a multi-threaded embedded architecture that contains 6 application-specific processes, which are started when the initialization process runs. Each of them has its own set of running threads.
What I want to do is suspend the running threads of one particular process, depending on whether the device is connected to the PC.
I have searched around, and the closest thing I've found to what I'm looking for is the following: How to obtain list of thread handles from a win32 process?
However, that code returns the list of all running threads. This won't work for me, since I suspend every thread I obtain on the assumption that they all belong to the same process, so I do not check which process they belong to.
Also, I am obtaining the list of a process's running threads from another process.
Is there an existing Windows method that allows such control, or am I stuck with having to identify the threads I need to suspend from the entire list?
Instead of trying to forcefully suspend threads (which is likely to cause trouble when you suspend one at an unlucky moment), you'd be better off using a named, manual-reset event created with CreateEvent().
Named events are easily shared between processes. You simply call CreateEvent() again with the same name. A typical event name would be MyCompany_MyProduct_MyFeature_EventName, to prevent accidental collisions.
When you call WaitForSingleObject() on a "set" event, the wait is satisfied immediately.
When you wait on a "reset" event, the wait suspends your thread until the event is set.
Your first application will have its thread(s) wait on the event when they're not doing any work and are therefore safe to suspend.
You will set and reset the event from the second application to control the first application.
This way, you don't need to enumerate threads, and it's more robust.

Is it possible to detect 'end process' externally?

Is there some way to detect that a program was ended by windows task manager's "end process"?
I know that it's more or less impossible to do that from within the application being ended (other than by building your app as a driver and hooking ZwTerminateProcess), but I wonder if there is a way to notice it from outside.
I don't want to stop the program from terminating, just to know that it was ended by "end process" (and not by any other way).
There might be a better way - but how about using a simple flag?
Naturally, you'd have to persist this flag somewhere outside of the process/program's memory, such as the registry, a database, or the file system. Essentially, when the app starts up, you set the flag to 'True'; when the app shuts down through the normal means, you set the flag to 'False'.
Each time the application starts you can check the flag to see if it was not shut down correctly the previous time it was executed.
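As a minimal sketch of the flag idea, assuming the flag is simply the existence of a marker file (the RunMarker class and the file path are invented for this example):

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>
#include <utility>

// The "flag" is the existence of a marker file at a caller-chosen path.
class RunMarker {
public:
    explicit RunMarker(std::string path) : path_(std::move(path)) {
        assert(!path_.empty());
        // If the marker is still there, the previous run never reached
        // mark_clean_shutdown().
        std::ifstream existing(path_);
        previous_run_unclean_ = existing.good();
        existing.close();
        // Set the flag for this run.
        std::ofstream marker(path_);
        marker << "running\n";
    }

    bool previous_run_was_unclean() const { return previous_run_unclean_; }

    // Call on every normal exit path; clears the flag.
    void mark_clean_shutdown() { std::remove(path_.c_str()); }

private:
    std::string path_;
    bool previous_run_unclean_ = false;
};
```

Note that this only tells you the last run did not shut down normally. A crash or a power cut leaves the marker behind just like "end process" does, so it cannot single out Task Manager.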
Open up a handle to the process with OpenProcess, and then wait on that handle using one of the wait functions such as WaitForSingleObject. You can get the exit status of the process using GetExitCodeProcess. If you need your program to remain responsive to user input while waiting, then make sure to wait on a separate thread (or you can periodically poll using a timeout of zero, but remember the performance consequences of polling -- not recommended).
When you're done, don't forget to call CloseHandle. The process object won't be fully deleted from the OS until all of its handles are closed, so you'll leak resources if you forget to call CloseHandle.
Note that there's no way to distinguish between a process exiting normally or being terminated forcefully. Even if you have a convention that your program only ever exits with a status of 0 (success) or 1 (failure) normally, some other process could call TerminateProcess(YourProcess, 1), and that would be indistinguishable from your ordinary failure mode.
According to the documentation, ExitProcess calls the entry point of all loaded DLLs with DLL_PROCESS_DETACH, whereas TerminateProcess does not. (Exiting the main function results in a call to ExitProcess, as do most unhandled exceptions.)
You might also want to look into Application Recovery and Restart.
One option might be to create a "watchdog" application (installed as a service, perhaps) that monitors WMI events for stopping a process via the ManagementEventWatcher class (in the System.Management namespace).
You could query for the death of your process on an interval or come up with some event driven way to alert of your process's demise.
Here's sort of an example (it's in C# though) that could get you started.

Number of parallel instances of my process (app)

Is there some portable way to check the number of parallel instances of my app?
I have a C++ app (Win32) where I need to know how many times it has been started. The problem is that several users can start it in parallel (terminal server), so I cannot search the "running process" list, because I'm not able to access the list of other users' processes.
I tried it with a semaphore (boost and Win32 CreateSemaphore).
It worked, but now I have the problem that if the app crashes (an assertion, or the process just gets killed), the counter is not changed (rebooting helps).
Also, manually removing or resetting the semaphore counter in my code is not possible, because I don't know whether somebody else is running my application.
Edited to add:
Suppose you have a license that lets you run 20 full-functionality copies of your program. Then you could have 20 mutexes, named MyProgMutex1 through MyProgMutex20. At startup, your program can loop through the mutexes. If it finds a spare mutex that it can take, it stops looping and enters full-functionality mode. If it loops through all the mutexes without being able to take any of them, then it enters reduced-functionality mode.
Original answer:
I assume you want to make sure that only one copy of your process runs at once. (Or, for Terminal Server, one copy of your process per login session).
Your named semaphore solution is close. The right way to do this is a named mutex. Use CreateMutex to make the mutex, then call WaitForSingleObject with a timeout of zero. If WaitForSingleObject returns WAIT_TIMEOUT, another copy of the process is running. If it returns WAIT_OBJECT_0 or WAIT_ABANDONED, then you are the only copy of the process. You need to keep the mutex handle open while your program runs: either call CloseHandle when your process is about to exit, or just deliberately leak the handle and rely on Windows's built-in cleanup to release the handle for you when your process exits. Windows automatically releases a mutex owned by your process when the process exits, even after a crash, so the next copy to wait on it gets WAIT_ABANDONED.
The only thing I can think of that mitigates the problem of crashed processes is a kind of “dead man’s switch”: each process needs to update its status in regular intervals. If a process fails to do this, it’s automatically discarded from the list of active processes.
This technique requires that one of the processes acts as a server which keeps tabs on whether the other processes have updated recently. If the server dies, then another process can take over. This, in turn, requires that each process tests whether there is still a server alive.
Alternatively, each process can be its own server and keep track locally. This may be easier to implement than server-switching.
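The bookkeeping behind such a dead man's switch might look like the following. This is only a sketch of the counting part: the HeartbeatRegistry class and the integer instance ids are assumptions, and how the heartbeats travel between processes (shared memory, pipes, window messages, ...) is left open.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <map>

using Clock = std::chrono::steady_clock;

// Remembers when each instance last reported in.
class HeartbeatRegistry {
public:
    explicit HeartbeatRegistry(Clock::duration timeout) : timeout_(timeout) {
        assert(timeout > Clock::duration::zero());
    }

    // Called whenever an instance (identified by a hypothetical id) reports in.
    void beat(int instance_id, Clock::time_point now) {
        last_seen_[instance_id] = now;
    }

    // Drops instances that have been silent for longer than the timeout,
    // then returns how many are still considered alive.
    std::size_t alive_count(Clock::time_point now) {
        for (auto it = last_seen_.begin(); it != last_seen_.end();) {
            if (now - it->second > timeout_) {
                it = last_seen_.erase(it);  // presumed crashed or killed
            } else {
                ++it;
            }
        }
        return last_seen_.size();
    }

private:
    Clock::duration timeout_;
    std::map<int, Clock::time_point> last_seen_;
};
```

The timeout should be a few times longer than the heartbeat interval, so that a process which is merely busy for a moment is not counted as dead.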
You can broadcast a message, and the other instances of your application should each send a response. Count the responses and you get the number of instances.