creating more than 1000 threads using pthread_create() - c++

I'm trying to create 1000 threads using the pthread_create() function.
This is the statement I'm using:
for (int i = 0; i < 1000; i++)
{
    retValue = pthread_create(&threadId, NULL, simplethreadFunction, NULL);
}
Every time this for-loop runs, does it create a new thread?
This is a simple thing. But I'm unable to understand it.

Every time this for-loop runs, does it create a new thread?
Yes, it does.
This is a simple thing. But I'm unable to understand it.
I will add a few more points:
The first parameter of pthread_create is a pointer to a pthread_t. You pass the address of a variable, and the function writes the new thread's identifier into it.
When the function creates a thread, it generates an opaque, unique identifier for that thread and stores it in the location you pointed to, so that you can use it later if required.
If you pass the same pointer all 1000 times, you keep the identifier of only one thread (the last one created), because each call overwrites the previous value.
You need this identifier if you want to perform further operations on a thread (such as joining it).
For details about this function and other thread-related functions, see the pthreads documentation.
Don't forget to call pthread_exit in your main context; otherwise the whole program (including the created threads) might terminate before all your threads have finished.
Also, regarding the time: as far as I can tell, reusing the same pthread_t variable should not affect how long thread creation takes; it only makes the threads you created less usable. The time you measure is also not THE time for creating 1000 threads; it depends on many other factors such as the platform and the implementation.
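A minimal sketch of the point about thread identifiers, assuming POSIX threads are available: each thread gets its own pthread_t slot, so every one of them can be joined later. The names create_and_join, simple_thread_function and g_ran are illustrative and not from the question.

```cpp
#include <pthread.h>

#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many threads actually ran (illustrative bookkeeping only).
static std::atomic<std::size_t> g_ran{0};

static void* simple_thread_function(void*) {
    g_ran.fetch_add(1);
    return nullptr;
}

// Tries to create `count` threads, keeping one pthread_t per thread,
// joins every thread that started, and returns how many ran.
std::size_t create_and_join(std::size_t count) {
    g_ran = 0;
    std::vector<pthread_t> ids(count);  // one identifier slot per thread
    std::size_t started = 0;
    for (; started < count; ++started) {
        // pthread_create returns non-zero on failure (e.g. resource limits).
        if (pthread_create(&ids[started], nullptr,
                           simple_thread_function, nullptr) != 0) {
            break;
        }
    }
    for (std::size_t i = 0; i < started; ++i) {
        pthread_join(ids[i], nullptr);  // possible only because each ID was kept
    }
    return g_ran.load();
}
```

Calling create_and_join(1000) from main would attempt the 1000 threads from the question; whether all of them start depends on platform limits. Joining every thread also keeps the program from finishing before its threads do.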

Related

CreateThread's threadProc race condition

I need to spawn 4 threads, that basically do the same thing but with a different variable each. So I call ::CreateThread 4 times, giving the same threadProc and 'this' as a parameter. Now in threadProc, I need to pick the right variable to work with. I have a vector of objects, and I push into it the object immediately after each of the CreateThread call.
// at this point myVec has say, 2 items
HANDLE hThread = ::CreateThread( NULL, 0, threadProc, (LPVOID)this, 0, NULL );
myVecObj.threadHandle = hThread;
myVec.push_back(myVecObj); // myVec.size() == 3 now
DWORD CALLBACK myClass::threadProc(LPVOID lpContext)
{
    myClass *pMyClass = (myClass *)lpContext;
    int vecCount = pMyClass->myVec.size(); // Is this 3??
    char * whatINeed = (char*)pMyClass->myVec[vecCount-1].whatINeed;
    return 0;
}
My doubt/question is how fast does the threadProc fire - could it beat the call to myVec.push_back()? Is this a race condition that I'm introducing here? I'm trying to make the assumption that when each threadProc starts (they start at different times, not one after the other), I can safely take the last object in the class' vector.
I need to spawn 4 threads, that basically do the same thing but with a different variable each. So I call ::CreateThread 4 times, giving the same threadProc and this as a parameter.
Now in threadProc, I need to pick the right variable to work with.
Why not pass the thread a pointer to the actual object it needs to act on?
I have a vector of objects, and I push into it the object immediately after each of the CreateThread call.
That is the wrong way to handle this. And yes, that is a race condition. Not only for the obvious reason - the thread might start running before the object is pushed - but also because any push into a vector can potentially reallocate the internal array of the vector, which would be very bad for threads that have already obtained a pointer to their data inside the vector. The data would move around in memory behind the thread's back.
To solve this, you need to either:
push all of the objects into the vector first, then start your threads. You can pass a pointer to each vector element to its respective thread. But this works only if you don't modify the vector anymore while any thread is running, for the reason stated above.
start the threads in a suspended state first, and then resume them after you have pushed all the objects into the vector. This also requires that you don't modify the vector anymore. It also means you will have to pass each thread an index to a vector element, rather than passing it a pointer to the element.
get rid of the vector altogether (or at least change it to hold object pointers instead of actual objects). Dynamically allocate your objects using new and pass those pointers to each thread (and optionally to the vector) as needed. Let each thread delete its object before exiting (and optionally remove it from the vector, with proper synchronizing).
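The first option can be sketched as follows. This is only an illustration: it uses std::thread as a portable stand-in for ::CreateThread, and WorkItem, worker and run_all are hypothetical names. The vector is filled completely before any thread starts and is not modified while the threads run, so the element pointers stay valid.

```cpp
#include <cassert>
#include <string>
#include <thread>
#include <vector>

// Hypothetical per-thread data; stands in for the objects stored in myVec.
struct WorkItem {
    std::string whatINeed;
    std::string result;
};

// Each thread touches only the element it was handed.
void worker(WorkItem* item) {
    item->result = "processed " + item->whatINeed;
}

std::vector<WorkItem> run_all(const std::vector<std::string>& inputs) {
    std::vector<WorkItem> items;
    items.reserve(inputs.size());
    for (const std::string& s : inputs) {
        items.push_back({s, ""});  // 1. push every object first
    }

    std::vector<std::thread> threads;
    threads.reserve(items.size());
    for (WorkItem& item : items) {
        threads.emplace_back(worker, &item);  // 2. then start the threads
    }
    for (std::thread& t : threads) {
        t.join();  // 3. the vector is left untouched until all threads finish
    }
    return items;
}
```

With four inputs this starts four threads, each working on its own element, without any thread having to guess which element is "the last one".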
My doubt/question is how fast does the threadProc fire
That is entirely up to the OS scheduler to decide.
could it beat the call to myVec.push_back()?
Yes, that is a possibility.
Is this a race condition that I'm introducing here?
Yes.
I'm trying to make the assumption
Don't make assumptions!
that when each threadProc starts (they start at different times, not one after the other), I can safely take the last object in the class' vector.
That is not a safe assumption to make.
There is no synchronisation between the modification of myVec, i.e., the myVec.push_back() call, and the read of its size in another thread. I do realise that you don't use standard threads, but applying the C++11 rules, this is a data race and the program has undefined behaviour.
Note that the data race isn't just theoretical: there is a fair chance that you see the modification happen after the read. Creating a thread may not be fast but some implementations actually don't create OS level threads but rather keep a pool of threads around which are used when apparently spawning a new thread.
In similar contexts I heard the excellent argument "... but it only happens once in a million times!". This particular issue would have happened on the 48 core machine about 10 times per second, assuming the estimate "once in a million" were correct.
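For illustration, here is a minimal sketch of what synchronising the two accesses could look like with a std::mutex. Registry and fill_concurrently are hypothetical names, and standard C++ threads are used instead of the Windows API. Note that the lock only removes the data race; it does not guarantee that a thread sees its own element as the last one.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical wrapper: every access to the vector goes through one mutex.
class Registry {
public:
    void add(int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push_back(value);
    }
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    mutable std::mutex mutex_;
    std::vector<int> items_;
};

// Several threads push concurrently; returns the final element count.
std::size_t fill_concurrently(int writers, int per_writer) {
    Registry registry;
    std::vector<std::thread> threads;
    for (int w = 0; w < writers; ++w) {
        threads.emplace_back([&registry, per_writer] {
            for (int i = 0; i < per_writer; ++i) {
                registry.add(i);
            }
        });
    }
    for (std::thread& t : threads) {
        t.join();
    }
    return registry.size();
}
```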

Will my pthread wait or will the main thread wait?

So I'm getting the hang of using C/C++ but I'm still a bit misguided. I'm also trying to learn synchronization at the same time, so things aren't going perfectly.
So my potential problem here is,
I have a Node object, Node has a method called run. Run creates a pthread and passes a function pointer of a function called compute() as a parameter.
The Compute function has one parameter which is the Node that called Run()
The Compute function will then access a Semaphore (sem_t) that is a field of the Node object passed as a parameter and will call sem_wait(Node.sem) on that semaphore.
If I do this, will the newly created thread that is running the compute function actually call sem_wait and do the defined behavior? Or will the process that originally created the Node call sem_wait?
The sem_wait call will execute in the thread in which it is called (as @Jason C points out in his comment). If that happens in run after the thread has been started, sem_wait will be executed in the first thread; if it happens inside compute, as the question describes, it will be executed in the newly created thread.
You seem to be thinking that because the Node object is used in both threads that somehow has an effect on which thread will execute a call. It doesn't. Threads share memory space so your Node object can be used in any thread within a process. That's when you start getting into thread safety issues.
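A minimal sketch of the first point, assuming POSIX threads and unnamed semaphores (sem_init) are available on the platform: when sem_wait is called inside compute, it is the thread running compute that blocks, not the thread that created it. The Node layout and the helper wait_runs_in_new_thread are illustrative assumptions.

```cpp
#include <pthread.h>
#include <semaphore.h>

#include <cassert>

// Hypothetical Node: owns the semaphore and remembers who waited on it.
struct Node {
    sem_t sem;
    pthread_t worker;     // thread created by the caller
    pthread_t waiter_id;  // thread that actually executed sem_wait
};

static void* compute(void* arg) {
    Node* node = static_cast<Node*>(arg);
    sem_wait(&node->sem);              // blocks only the thread running compute
    node->waiter_id = pthread_self();  // record which thread that was
    return nullptr;
}

// Returns true if sem_wait ran in the new thread and not in the caller.
bool wait_runs_in_new_thread() {
    Node node;
    if (sem_init(&node.sem, 0, 0) != 0) {
        return false;
    }
    if (pthread_create(&node.worker, nullptr, compute, &node) != 0) {
        sem_destroy(&node.sem);
        return false;
    }
    sem_post(&node.sem);  // the creating thread is not blocked; it wakes the worker
    pthread_join(node.worker, nullptr);
    sem_destroy(&node.sem);
    return pthread_equal(node.waiter_id, node.worker) != 0 &&
           pthread_equal(node.waiter_id, pthread_self()) == 0;
}
```

The same Node object is visible to both threads, but that has no bearing on which thread executes the call.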

Creating a thread pool in one function call and using it from another function call

I have a Fortran program that calls a C++ dll to do some mathematical operations on 10000 sets of data. The data sets are totally independent from each other. I was planning to create a thread pool and then send tasks to it. However, the call to the dll will be made more than 1000 times (each call the 10000 sets of data are being processed).
My question is: when I create the thread pool during the first call to the dll, what happens to this thread pool after the function in the dll returns? Can the second call (and the remaining 998 calls) access the pool that was created during the first call?
You can indeed use the same thread pool, if you set things up right.
Objects created on the stack of the FORTRAN->C++ calling thread will be destroyed as that stack unwinds and control returns to FORTRAN, so it's not a good idea to have the thread pool management data on that stack. You can, however:
launch another thread that creates the thread pool management data/object, or
allocate on the heap (using new) to decouple lifetime from the FORTRAN->C++ calls.
The latter is probably easier and cleaner... a pointer to the heap object/data managing the thread pool can be returned to FORTRAN and used as a "handle" for future calls, indicating the same thread pool should be used.
If you have control over the Fortran code, you can avoid hiding the state you maintain by exposing three functions instead of one.
someStateHandle PrepareBackgroundWork();
// Then you do your actual call series...
DoMyMath(someStateHandle, args...);
// And when you are done with all that, you call
FinalizeBackgroundWork(someStateHandle);
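One possible shape for these three functions, as a sketch only: the pool state lives on the heap and an opaque pointer serves as the handle. PoolState, worker_loop, the parameter lists and the "square each element" workload are all assumptions made for illustration, and the Fortran side of the binding is omitted. The thread count must be at least 1.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical pool state; heap-allocated so it outlives each call.
struct PoolState {
    std::vector<std::thread> workers;
    std::queue<std::function<void()>> tasks;
    std::mutex m;
    std::condition_variable task_ready;  // a task was queued or stop was set
    std::condition_variable all_done;    // pending dropped to zero
    std::size_t pending = 0;             // queued plus running tasks
    bool stop = false;
};

static void worker_loop(PoolState* s) {
    for (;;) {
        std::function<void()> task;
        {
            std::unique_lock<std::mutex> lock(s->m);
            s->task_ready.wait(lock, [s] { return s->stop || !s->tasks.empty(); });
            if (s->tasks.empty()) return;  // stop requested and nothing left
            task = std::move(s->tasks.front());
            s->tasks.pop();
        }
        task();  // run outside the lock
        std::lock_guard<std::mutex> lock(s->m);
        if (--s->pending == 0) s->all_done.notify_all();
    }
}

extern "C" void* PrepareBackgroundWork(int threads) {
    PoolState* s = new PoolState;
    for (int i = 0; i < threads; ++i) s->workers.emplace_back(worker_loop, s);
    return s;  // opaque handle for the caller to pass back
}

// Squares each element in place, one task per element; blocks until done.
extern "C" void DoMyMath(void* handle, double* data, int n) {
    PoolState* s = static_cast<PoolState*>(handle);
    {
        std::lock_guard<std::mutex> lock(s->m);
        for (int i = 0; i < n; ++i) {
            s->tasks.push([data, i] { data[i] *= data[i]; });
            ++s->pending;
        }
    }
    s->task_ready.notify_all();
    std::unique_lock<std::mutex> lock(s->m);
    s->all_done.wait(lock, [s] { return s->pending == 0; });
}

extern "C" void FinalizeBackgroundWork(void* handle) {
    PoolState* s = static_cast<PoolState*>(handle);
    {
        std::lock_guard<std::mutex> lock(s->m);
        s->stop = true;
    }
    s->task_ready.notify_all();
    for (std::thread& t : s->workers) t.join();
    delete s;
}
```

The same worker threads serve every DoMyMath call made between PrepareBackgroundWork and FinalizeBackgroundWork, which is the reuse the question asks about.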
If you do not have control over the fortran code, you will have to decide what you want to keep around (Threadpool stuff or thread handles and a few synchronization objects) and lazily initialize them.
struct MyWorkerContext
{
    size_t numberOfWorkerThreads;
    std::vector<HANDLE> workerHandles;
    // ...
};
static MyWorkerContext* s_context = NULL; // Sorry - looks like a singleton to me.
void DoMyMath( args..)
{
    if(NULL == s_context) InitializeContext();
    if( NULL != s_context )
    {
        // do the calculations using all that infrastructure.
    }
}
Then clean up s_context, e.g. in DllMain() or, preferably, earlier.
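If several threads could make the first call at the same time, the NULL check by itself is not enough. A hedged alternative is a function-local static, which C++11 initialises exactly once even under concurrent first calls. GetContext is a hypothetical accessor, the context is reduced to a single field, and cleanup is not addressed here.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>

// Reduced version of the context from the answer (handles omitted).
struct MyWorkerContext {
    std::size_t numberOfWorkerThreads;
};

// Hypothetical accessor: the context is built on first use, and the
// language guarantees the initialiser runs only once.
MyWorkerContext& GetContext() {
    static MyWorkerContext context = [] {
        unsigned hw = std::thread::hardware_concurrency();  // may report 0
        return MyWorkerContext{static_cast<std::size_t>(hw != 0 ? hw : 1u)};
    }();
    return context;
}
```

Every call returns a reference to the same object, so later calls into the DLL see the state set up by the first one.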
Last but not least, I think there is a "default thread pool" that you might be able to use instead of creating your own.

What happens to a thread in a vector when function execution ends?

I want to know more about std::thread, and specifically what will happen if I have a vector of threads, and one of the threads finishes executing.
Picture this example:
A vector of threads is created, which all execute the following function:
void function_test(char* flag)
{
    while(*flag == 1) { // Do Something
    }
}
'char* flag' points to a flag signalling the function to stop execution.
Say, for example, the vector contains 10 threads, which are all executing. Then the flag is set to zero for thread number 3. (The 4th thread in the vector, as vector starts from zero.)
Good practice is to then join the thread.
vector_of_threads[3].join();
How many std::threads will the vector now contain? Can I re-start the finished thread with the same function again, or even a different function?
The reason for my question is that I have a vector of threads, and sometimes they will be required to stop executing, and then execution "falls off the end" of the function.
One solution to restart that thread would (I assume, perhaps incorrectly?) be to erase that element from the vector and then insert a new thread, which will then begin executing. Is this correct though, since when a thread stops, will it still be inside the vector? I assume it would be?
Edit
'function_test' is not allowed to modify any other functions flags. The flags are modified by their own function and the calling function. (For the purposes of this, imagine flag enables communication between main and the thread.)
Does this fix the data-race problem, or is it still an issue?
It's not specifically what you're asking about, but flag should be atomic<char>* or you have a data race, i.e. undefined behaviour. Also, if it only holds true or false I'd use atomic<bool>* and just test if (*flag).
As for your actual question:
How many std::threads will the vector now contain?
It will contain exactly the same number as it did previously, but once you have joined it, one of them is no longer "joinable" because it no longer represents a thread of execution. When the thread stops running it doesn't magically alter the vector to remove an element; it doesn't even know the vector exists! The only change visible in the main thread is that calling vector_of_threads[3].join() on a thread that has already finished will not block and will return immediately, because there is nothing left to wait for.
You could erase the joined std::thread from the vector and insert a new one, but another alternative is to assign another std::thread to it that represents a new thread of execution:
vector_of_threads[3] = std::thread(f, &flags[3]);
Now vector_of_threads[3] represents a running thread and is "joinable" again.
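Putting the pieces together, here is a minimal sketch that uses std::atomic<bool> flags as suggested above. RestartReport and stop_and_restart are illustrative names; the function stops one thread, checks the vector, restarts that slot, and then shuts everything down.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Runs until its own flag is cleared.
void function_test(std::atomic<bool>* flag) {
    while (flag->load()) {
        std::this_thread::yield();  // "Do Something"
    }
}

struct RestartReport {
    std::size_t size_after_join;
    bool joinable_after_join;
    bool joinable_after_restart;
};

RestartReport stop_and_restart(std::size_t count, std::size_t index) {
    std::vector<std::atomic<bool>> flags(count);
    for (auto& f : flags) f.store(true);

    std::vector<std::thread> threads;
    threads.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        threads.emplace_back(function_test, &flags[i]);
    }

    flags[index].store(false);  // ask one thread to stop
    threads[index].join();      // wait for it to fall off the end

    RestartReport report{};
    report.size_after_join = threads.size();  // the vector is unchanged
    report.joinable_after_join = threads[index].joinable();

    flags[index].store(true);
    threads[index] = std::thread(function_test, &flags[index]);  // reuse the slot
    report.joinable_after_restart = threads[index].joinable();

    for (auto& f : flags) f.store(false);  // shut everything down
    for (auto& t : threads) t.join();
    return report;
}
```

Move-assigning to threads[index] is only safe because that element was joined first; assigning to a std::thread that is still joinable calls std::terminate.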

CreateThread issue in C under Windows

I have the following code, which starts a thread.
int iNMHandleThread = 1;
HANDLE hNMHandle = 0;
hNMHandle = CreateThread( NULL, 0, NMHandle, &iNMHandleThread, 0, NULL);
if ( hNMHandle == NULL)
    ExitProcess(iNMHandleThread);
My question is
What will happen if I run this code while the thread is already in the running state?
I want to start multiple independent threads of NMHandle. Kindly give me some hints to solve this problem.
Each time you call CreateThread, a new thread is started that is independent of any other currently-running threads. Whether your "NMHandle" function is capable of running on more than one thread at a time is up to you: for example, does it rely on any global state?
What will happen if I run this code while the thread is already in the running state?
Another thread will start with the function NMHandle, independent of the other one.
I want to start multiple independent threads of NMHandle. Kindly give me some hints to solve this problem.
This code actually creates an independent thread. Create a loop if you want to create multiple threads executing the function NMHandle. If you need the thread handles later (e.g. waiting for a thread to end), you have to store them somewhere.
Make sure that NMHandle is thread-safe. If you don't know what that means, you shouldn't start multithreaded programming yet!
And another hint: you're passing a pointer to the local stack variable iNMHandleThread to the thread. As soon as the function that owns the variable returns, that pointer dangles and the thread may read garbage. You should rather pass the number by value, cast through a pointer-sized integer (CreateThread( NULL, 0, NMHandle, (LPVOID)(INT_PTR)iNMHandleThread, 0, NULL);), and cast it back the same way inside NMHandle.
CreateThread creates a new thread. The new thread obviously can't be in the running state before - it doesn't have a state before it's created. Compare the simple statement int i = 42; - There's no prior value of i before 42, because the object doesn't exist yet. Obviously the old thread that calls CreateThread() must be running - otherwise it couldn't have run to the line that calls CreateThread() !
Every time you call CreateThread, you will get a new thread. You will also get a new thread handle and ID for every call, so you can't store them all in a single variable such as HANDLE hNMHandle. Consider a std::list<DWORD> NmThreadIDs (thread IDs are DWORD values) and a std::list<HANDLE> NmThreadHandles.
Furthermore, all new threads will start by calling NMHandle(). Is that function thread-safe? That is to say, will that function work properly when executed by two threads at the same time, or interleaved, or in any other random order? Mechanisms like mutexes and critical sections can be used to exclude some unsafe orders of execution.
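The hints from these answers can be sketched together in portable C++. This is an illustration under stated assumptions: std::thread stands in for CreateThread, the vector of std::thread objects plays the role of the stored handles, and run_workers and the summing workload are hypothetical. Each thread receives its number by value, and a mutex protects the one piece of shared state.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Starts `count` independent threads running the same code and returns
// the sum of the numbers they were given.
int run_workers(int count) {
    int total = 0;           // shared state touched by every thread
    std::mutex total_mutex;  // excludes unsafe interleavings on `total`

    std::vector<std::thread> workers;  // kept so each thread can be joined later
    workers.reserve(count);
    for (int id = 0; id < count; ++id) {
        // `id` is captured by value: no thread holds a pointer to a loop local.
        workers.emplace_back([id, &total, &total_mutex] {
            std::lock_guard<std::mutex> lock(total_mutex);
            total += id;
        });
    }
    for (std::thread& t : workers) {
        t.join();  // wait for every thread before `total` goes out of scope
    }
    return total;
}
```

Without the mutex, the concurrent updates to total would be a data race; without the joins, the threads could outlive the variables they reference.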