This is what I need to do -
1. Define a handle threadHandle and define an array of handles h[20]...where each entry in the array has value threadHandle.
My code opens up 20 threads and once done, each thread has to signal the main thread and once all of them signal, the main thread has to log something to a log file.
I plan to do something like:
define threadHandle and the array of handles h[20] as described above.
Once the code opens 20 threads, call WaitForMultipleObjects(20, h, TRUE, 10000).
Now the code will wait for all the handles in the h array to be set before the wait returns. But since all the values in h are the same, the wait function returns an error. Is there a way around this? I basically need all the threads to signal back to the calling thread... defining 20 handles, one for each of the 20 threads, doesn't seem like a good idea either.
Can I do something like this instead? -
define threadHandle and the array of handles h[20].
Maintain a variable count for the number of threads that signaled back to the main thread.
Call WaitForSingleObject(threadHandle, timeout).
Once this returns, increment count, and if count < 20, repeat the above wait statement.
Keep doing this until count = 20, then log to the file.
Of course, if any of the waits times out in between, we log a failure to the log file.
I am trying this out, but was wondering if there is a better way to do this.
TIA.
anand
Create 20 Event objects. Put their handles into h. Pass one to each thread you create. When the thread needs to signal the parent, it signals that event. The parent waits on the Event handles, and when they're all signaled, it writes to the log.
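A sketch of that approach (the Worker body is illustrative, and error handling is omitted for brevity):

```c
/* Sketch: 20 auto-reset events, one per thread; the main thread waits
 * on all of them at once with bWaitAll = TRUE. */
#include <windows.h>
#include <stdio.h>

#define NTHREADS 20

static DWORD WINAPI Worker(LPVOID param) {
    HANDLE doneEvent = (HANDLE)param;
    /* ... do the real work here ... */
    SetEvent(doneEvent);            /* signal the main thread */
    return 0;
}

int main(void) {
    HANDLE h[NTHREADS];
    for (int i = 0; i < NTHREADS; i++) {
        h[i] = CreateEvent(NULL, FALSE, FALSE, NULL);  /* auto-reset, unsignaled */
        /* thread handle not kept here; a real program should store and close it */
        CreateThread(NULL, 0, Worker, h[i], 0, NULL);
    }
    /* bWaitAll = TRUE: return only when every event is signaled. */
    DWORD rc = WaitForMultipleObjects(NTHREADS, h, TRUE, 10000);
    if (rc == WAIT_TIMEOUT)
        printf("failure: not all threads signaled in time\n");
    else
        printf("all %d threads signaled\n", NTHREADS);
    for (int i = 0; i < NTHREADS; i++)
        CloseHandle(h[i]);
    return 0;
}
```

Note that 20 handles is well under the MAXIMUM_WAIT_OBJECTS limit of 64.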
Use one semaphore instead (see the CreateSemaphore() API, with the count initialized to 0) that all the threads signal. Call WaitForSingleObject in a for loop, counting up to 20. It is much easier to set up, cannot miss any events, and will work for any number of threads (within reason).
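A sketch of the semaphore approach (the Worker body is illustrative, and error handling is omitted for brevity):

```c
/* Sketch: one counting semaphore that every thread releases once;
 * the main thread takes 20 units in a loop. */
#include <windows.h>
#include <stdio.h>

#define NTHREADS 20

static HANDLE gDone;   /* counting semaphore, initial count 0 */

static DWORD WINAPI Worker(LPVOID param) {
    /* ... do the real work here ... */
    ReleaseSemaphore(gDone, 1, NULL);   /* signal "one thread finished" */
    return 0;
}

int main(void) {
    gDone = CreateSemaphore(NULL, 0, NTHREADS, NULL);  /* count 0, max 20 */
    for (int i = 0; i < NTHREADS; i++)
        CreateThread(NULL, 0, Worker, NULL, 0, NULL);

    int count = 0;
    for (; count < NTHREADS; count++) {
        if (WaitForSingleObject(gDone, 10000) != WAIT_OBJECT_0) {
            printf("failure: timed out after %d signals\n", count);
            return 1;
        }
    }
    printf("all %d threads signaled\n", count);   /* log success here */
    CloseHandle(gDone);
    return 0;
}
```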
Maintain a variable count for the number of threads. Yes you can do that.
nCount The number of object handles in the array pointed to by lpHandles. The maximum number of object handles is MAXIMUM_WAIT_OBJECTS. This parameter cannot be zero.
nCount is not specifying the array size but the number of handles to wait for. However, it shall not exceed the array size and both shall not exceed MAXIMUM_WAIT_OBJECTS.
Hint: This is not limited to specific handles like thread handles. You can handle a mixture of various waitable handles this way. (MSDN WaitForMultipleObjects function)
But since all the values of h are the same... No, they are not the same and the return value of WaitForMultipleObjects will vary accordingly (WAIT_OBJECT_0 + nCount).
Related
I am writing a C++ program in Qt that has an OnReceive(int value) event. It captures integer values and push_backs them into a std::vector. On another worker thread I have access to this vector, and I can use a semaphore to wait for 20 values and then process them.
I want to do some optimization.
My question is: how can I segment my buffer or vector into 3 parts, e.g. 0-4, 5-10, 11-19, so that, for example, as soon as 5 values are available in the vector (e.g. 0 to 4), the second worker starts to process them while the first thread still continues to get the rest of the values?
This way I want to have an overlap between my threads, so they don't need to run serially.
Thank you.
Use a wait-free ring buffer.
Boost claims to have one: boost::lockfree::spsc_queue.
Note that it is in the lockfree folder, but all its methods claim to be thread-safe and wait-free.
I'm trying to implement a gather function that waits for N processes to continue.
struct sembuf operations[2];
operations[0].sem_num = 0;
operations[0].sem_op = -1;  // wait() or P()
operations[0].sem_flg = 0;
operations[1].sem_num = 0;
operations[1].sem_op = 0;   // wait until it becomes 0
operations[1].sem_flg = 0;
semop(this->id, operations, 2);
Initially, the value of the semaphore is N.
The problem is that it freezes even when all processes have executed the semop function. I think it is related to the fact that the operations are executed atomically (but I don't know exactly what it means). But I don't understand why it doesn't work.
Does the code subtract 1 from the semaphore and then block the process if it's not the last or is the code supposed to act in a different way?
It's hard to see what the code does without the whole function and algorithm.
By the looks of it, you apply two actions in a single atomic call: subtract 1 from the semaphore and wait for 0. Because the call is atomic, semop blocks until both operations can succeed at once, i.e. until the value is exactly 1, so that after the decrement it becomes 0. Since no blocked process ever actually decrements the semaphore, the value stays at N and everyone freezes. Doing the decrement and the wait-for-zero as two separate semop calls avoids this.
There could be several issues if all processes freeze: the semaphore is not shared between all processes, you got the number of processes wrong when initializing the semaphore, or one process leaves the barrier, increases the semaphore at a later point, and returns to the barrier.
I suggest debugging to see that all processes are actually in the barrier, and maybe even printing each time you do any action on the semaphore (preferably on the same console).
As for what an atomic action is: it is a single operation, or a sequence of operations, that is guaranteed not to be interrupted while being executed. This means no other process/thread will interfere with the action.
I have a question regarding cascading MPI_Bcast calls. I want to know if there is anything to keep in mind when I want to distribute different data blocks, from changing sending threads, to all other threads right after each other.
Imagine like this:
double buff = 12345; // value is not important for this example
for (i = 0; i < nthreads; i++) { // loop over all threads
    MPI_Bcast(&buff, 1, MPI_DOUBLE, i, MPI_COMM_WORLD); // the i-th thread sends, all others receive
    // some, but not all, threads do something with the data
    if (threadid > i) {
        // do something that can take much more time on some threads than on others
    }
}
I hope this example code explains the situation. Basically, there is a for loop over all threads, and in each iteration a different sending thread is used. The possible problem here is that each thread takes a different amount of time to get to MPI_Bcast again. Is it possible for the sending thread to already be there while some receiving threads are maybe still receiving from the last sender?
Do I need an MPI_Barrier here, or can I cascade as many Bcasts as I want, as long as it is clear that each call is reached by every thread once?
Edit: And what about when there is no loop over all threads, but some other way to iterate through a list of sending threads, so that it is possible to have the same sending thread multiple times in a row?
Is it possible then to mix something up? Maybe the receiving threads, which only wait for information from the thread with id i, do not differentiate between the first, second, or one of the following Bcasts from that thread?
I found a bug in my program: the same thread is woken twice, taking the opportunity of another thread to run, thus causing unintended behaviour. My program requires that all waiting threads run exactly once per turn. This bug happens because I use semaphores to make the threads wait. With a semaphore initialized to count 0, every thread calls down on the semaphore at the start of its infinite loop, and the main thread calls up in a for loop NThreads (the number of threads) times. Occasionally the same thread takes the up call twice, and the problem arises.
What is the way to deal with this problem properly? Is using condition variables and broadcasting a way to do this? Will it guarantee that every thread is woken once and only once? What other good ways are possible?
On windows, you could use WaitForMultipleObjects to select a ready thread from the threads that have not been run in the current Nthread iterations.
Each thread should have a "ready" event to signal when it is ready, and a "wake" event to wait on after it has signaled its "ready" event.
At the start of your main thread loop (1st of NThreads iteration), call WaitForMultipleObjects with an array of your NThreads "ready" events.
Then set the "wake" event of the thread corresponding to the "ready" event returned by WaitForMultipleObjects, and remove it from the array of "ready" handles. That will guarantee that a thread that has already been run won't be returned by WaitForMultipleObjects on the next iteration.
Repeat until the last iteration, where you will call WaitForMultipleObjects with an array of only 1 thread handle (I think this will work as if you called WaitForSingleObject).
Then repopulate the array of NThreads "ready" events for the next new Nthreads iterations.
Well, use an array of semaphores, one for each thread. If you want the array of threads to run once only, send one unit to each semaphore. If you want the threads to all run exactly N times, send N units to each semaphore.
Here is the problem:
I have two sparse matrices described as vector of triplets.
The task is to write multiplication function for them using parallel processing with Win 32 API. So I need to know how do I:
1) Create a thread in Win 32 API
2) Pass input parameters for it
3) Get return value.
Thanks in advance!
Edit: "Process" changed for "Thread"
Well, the answer to your question is CreateProcess and GetExitCodeProcess.
But the solution to your problem isn't another process at all, it's more threads. And probably OpenMP is a much more suitable mechanism than creating your own threads.
If you have to use the Win32 API directly for threads, the process is something like:
Build a work item descriptor by allocating some memory, storing pointers to the real data, indexes for what this thread is going to work on, etc. Use a structure to keep this organized.
Call CreateThread and pass the address of the work item descriptor.
In your thread procedure, cast the pointer back to a structure pointer, access your work item descriptor, and process the data.
In your main thread, call WaitForMultipleObjects to join with the worker threads.
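The steps above can be sketched as follows (the WorkItem layout and the SumPart job are illustrative, not from the question; error handling is omitted):

```c
/* Sketch: passing a work descriptor to a Win32 thread and joining it. */
#include <windows.h>
#include <stdio.h>

typedef struct {
    const int *data;   /* pointer to the shared input */
    int first, last;   /* index range this thread works on */
    long long result;  /* filled in by the thread */
} WorkItem;

static DWORD WINAPI SumPart(LPVOID param) {
    WorkItem *item = (WorkItem *)param;   /* cast back to the descriptor */
    item->result = 0;
    for (int i = item->first; i < item->last; i++)
        item->result += item->data[i];
    return 0;                             /* retrievable via GetExitCodeThread */
}

int main(void) {
    int data[100];
    for (int i = 0; i < 100; i++) data[i] = i;

    WorkItem items[2] = {
        { data,  0,  50, 0 },
        { data, 50, 100, 0 },
    };
    HANDLE threads[2];
    for (int i = 0; i < 2; i++)
        threads[i] = CreateThread(NULL, 0, SumPart, &items[i], 0, NULL);

    /* Join with all worker threads before reading their results. */
    WaitForMultipleObjects(2, threads, TRUE, INFINITE);
    for (int i = 0; i < 2; i++)
        CloseHandle(threads[i]);

    printf("total = %lld\n", items[0].result + items[1].result);  /* 4950 */
    return 0;
}
```

Writing the result into the descriptor, as here, is often simpler than squeezing it into the thread's DWORD exit code.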
For even greater efficiency, you can use the Windows thread pool and call QueueUserWorkItem. But while you won't have to create threads yourself, you'd then need event handles to join tasks back to the main thread. It's about the same amount of code I suspect.