Suppose I call a program with OMP_NUM_THREADS=16.
The first function calls #pragma omp parallel for num_threads(16).
The second function calls #pragma omp parallel for num_threads(2).
The third function calls #pragma omp parallel for num_threads(16).
Debugging with gdb shows me that on the second call 14 threads exit. And on the third call, 14 new threads are spawned.
Is it possible to prevent 14 threads from exiting on the second call? Thank you.
The proof listings are below.
$ cat a.cpp
#include <omp.h>
void func(int thr) {
int count = 0;
#pragma omp parallel for num_threads(thr)
for(int i = 0; i < 10000000; ++i) {
count += i;
}
}
int main() {
func(16);
func(2);
func(16);
return 0;
}
$ g++ -o a a.cpp -fopenmp -g
$ ldd a
...
libgomp.so.1 => ... gcc-9.3.0/lib64/libgomp.so.1
...
$ OMP_NUM_THREADS=16 gdb a
...
Breakpoint 1, main () at a.cpp:13
13 func(16);
(gdb) n
[New Thread 0xffffbe24f160 (LWP 27216)]
[New Thread 0xffffbda3f160 (LWP 27217)]
[New Thread 0xffffbd22f160 (LWP 27218)]
[New Thread 0xffffbca1f160 (LWP 27219)]
[New Thread 0xffffbc20f160 (LWP 27220)]
[New Thread 0xffffbb9ff160 (LWP 27221)]
[New Thread 0xffffbb1ef160 (LWP 27222)]
[New Thread 0xffffba9df160 (LWP 27223)]
[New Thread 0xffffba1cf160 (LWP 27224)]
[New Thread 0xffffb99bf160 (LWP 27225)]
[New Thread 0xffffb91af160 (LWP 27226)]
[New Thread 0xffffb899f160 (LWP 27227)]
[New Thread 0xffffb818f160 (LWP 27228)]
[New Thread 0xffffb797f160 (LWP 27229)]
[New Thread 0xffffb716f160 (LWP 27230)]
15 func(2);
(gdb)
[Thread 0xffffba9df160 (LWP 27223) exited]
[Thread 0xffffb716f160 (LWP 27230) exited]
[Thread 0xffffbca1f160 (LWP 27219) exited]
[Thread 0xffffb797f160 (LWP 27229) exited]
[Thread 0xffffb818f160 (LWP 27228) exited]
[Thread 0xffffbd22f160 (LWP 27218) exited]
[Thread 0xffffb899f160 (LWP 27227) exited]
[Thread 0xffffbda3f160 (LWP 27217) exited]
[Thread 0xffffbb1ef160 (LWP 27222) exited]
[Thread 0xffffb91af160 (LWP 27226) exited]
[Thread 0xffffba1cf160 (LWP 27224) exited]
[Thread 0xffffb99bf160 (LWP 27225) exited]
[Thread 0xffffbb9ff160 (LWP 27221) exited]
[Thread 0xffffbc20f160 (LWP 27220) exited]
17 func(16);
(gdb)
[New Thread 0xffffbb9ff160 (LWP 27231)]
[New Thread 0xffffbc20f160 (LWP 27232)]
[New Thread 0xffffb99bf160 (LWP 27233)]
[New Thread 0xffffba1cf160 (LWP 27234)]
[New Thread 0xffffbda3f160 (LWP 27235)]
[New Thread 0xffffbd22f160 (LWP 27236)]
[New Thread 0xffffbca1f160 (LWP 27237)]
[New Thread 0xffffbb1ef160 (LWP 27238)]
[New Thread 0xffffba9df160 (LWP 27239)]
[New Thread 0xffffb91af160 (LWP 27240)]
[New Thread 0xffffb899f160 (LWP 27241)]
[New Thread 0xffffb818f160 (LWP 27242)]
[New Thread 0xffffb797f160 (LWP 27243)]
[New Thread 0xffffb716f160 (LWP 27244)]
19 return 0;
The simple answer is that it isn't possible with GCC to force the runtime to keep the threads around. From a cursory reading of the libgomp source code, there are no ICVs, portable or vendor-specific, that prevent the termination of excess idle threads between consecutive regions. (Someone correct me if I'm wrong.)
If you really need to rely on the unportable requirement that the OpenMP runtime uses persistent threads across regions with varying team sizes in between, then use Clang or Intel C++ instead of GCC. Clang's (actually LLVM's) OpenMP runtime is based on the open-source version of Intel's and they both behave the way you want. Again, this is not portable and the behaviour may change in future versions. It is instead advisable to not write your code in such a way that its performance depends on the particularities of the OpenMP implementation. For example, if the loop takes several orders of magnitude more time than the creation of a thread team (which is on the order of tens of microseconds on modern systems), it won't really matter whether the runtime uses persistent threads or not.
If OpenMP overhead is really a problem, e.g., if the work done in the loop is not enough to amortise the overhead, a portable solution is to lift the parallel region out of the function and then either re-implement the for worksharing construct, as in @dreamcrash's answer, or (ab)use OpenMP's loop scheduling by setting a chunk size that results in only the desired number of threads working on the problem:
#include <omp.h>
void func(int thr) {
static int count;
const int N = 10000000;
int rem = N % thr;
int chunk_size = N / thr;
#pragma omp single
count = 0;
#pragma omp for schedule(static,chunk_size) reduction(+:count)
for(int i = 0; i < N-rem; ++i) {
count += i;
}
if (rem > 0) {
#pragma omp for schedule(static,1) reduction(+:count)
for(int i = N-rem; i < N; ++i) {
count += i;
}
}
#pragma omp barrier
}
int main() {
int nthreads = 16; // the largest thr value passed to func() below
#pragma omp parallel num_threads(nthreads)
{
func(16);
func(2);
func(16);
}
return 0;
}
You need chunks of exactly equal sizes in all threads. The second loop is there to take care of the case when thr does not divide the number of iterations. Also, one cannot simply sum across private variables, hence count has to be shared, e.g., by making it static. This is ugly and drags along a bunch of synchronisation necessities that may have overhead comparable with spawning new threads and make the entire exercise pointless.
One approach would be to create a single parallel region, filter out the threads that will be executing the for, and manually distribute the loop iterations per thread. For simplicity's sake, I will assume a parallel for schedule(static, 1):
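#include <omp.h>
void func(int total_threads) {
int count = 0;
int thread_id = omp_get_thread_num();
// Only the first total_threads threads of the team take part in the loop.
if (thread_id < total_threads)
{
for(int i = thread_id; i < 10000000; i += total_threads) {
count += i;
}
}
// Every thread of the team must reach the barrier, so it sits outside the if.
#pragma omp barrier
}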
int main() {
...
#pragma omp parallel num_threads(max_threads_to_be_used)
{
func(16);
func(2);
func(16);
}
return 0;
}
Bear in mind that there is a race condition on count += i; that would have to be fixed. In the original code, you could easily fix it by using the reduction clause, namely #pragma omp parallel for num_threads(thr) reduction(+:count). In the code with the manual for, you could solve it as follows:
#include <omp.h>
#include<stdio.h>
#include <stdlib.h>
int func(int total_threads) {
int count = 0;
int thread_id = omp_get_thread_num();
if (thread_id < total_threads)
{
for(int i = thread_id; i < 10000000; i += total_threads)
count += i;
}
return count;
}
int main() {
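int max_threads_to_be_used = 16; // the max that you want; 16 is the largest func() argument below
// calloc zero-initialises the slots in case the runtime supplies fewer threads than requested
int* count_array = (int*) calloc(max_threads_to_be_used, sizeof(int));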
#pragma omp parallel num_threads(max_threads_to_be_used)
{
int count = func(16);
count += func(2);
count += func(16);
count_array[omp_get_thread_num()] = count;
}
int count = 0;
for(int i = 0; i < max_threads_to_be_used; i++)
count += count_array[i];
printf("Count = %d\n", count);
return 0;
}
I would say that most of the time one uses the same number of threads in every parallel region, so this kind of pattern should not come up very often.
Related
I use multiple threads to update each item (a string) of a global vector.
Each thread updates the item at a different index.
I thought this was a good way to avoid updating the same data,
but I still get a core dump and I do not know why.
extern vector<string> gTestVec;
#define NUM 10
void * worker(void * args) {
thread_data * p = (thread_data *)args;
int i = p->thread_id;
for (int j=0; j<100; j++) {
gTestVec[i] += "a";
}
return NULL;
}
void do_complete_stage_test::excute() {
int i = 0;
pthread_t thd[NUM];
thread_data data[NUM];
for (i=0; i<NUM; i++) {
gTestVec.push_back(format("%d", i));
data[i].thread_id = i;
if (0 != pthread_create(&(thd[i]), NULL, &worker, (void *)&data[i])) {
printf("pthread_create failed");
}
}
for (int i=0; i<NUM; i++) {
if (0 != pthread_join(thd[i], NULL)) {
printf("pthread_join failed");
}
}
}
When I run the code, I sometimes get a core dump:
Starting program: /data/settle_script/isp_tran_collect/bin/isp_tran_collect -p 2134234
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7ffff2623700 (LWP 6316)]
[New Thread 0x7fffefb0e700 (LWP 6317)]
[New Thread 0x7fffef30d700 (LWP 6318)]
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffef30d700 (LWP 6318)]
0x00007ffff6787d67 in ?? () from /lib64/libstdc++.so.6
(gdb) bt
#0 0x00007ffff6787d67 in ?? () from /lib64/libstdc++.so.6
#1 0x00007ffff678899b in std::string::reserve(unsigned long) () from /lib64/libstdc++.so.6
#2 0x00007ffff6788bbf in std::string::append(char const*, unsigned long) () from /lib64/libstdc++.so.6
#3 0x000000000044babe in append (__s=0x880492 "a", this=<optimized out>) at /usr/include/c++/4.8.2/bits/basic_string.h:1009
#4 operator+= (__s=0x880492 "a", this=<optimized out>) at /usr/include/c++/4.8.2/bits/basic_string.h:942
#5 worker (args=<optimized out>) at ../src/do_complete_stage_test.cpp:21
#6 0x00007ffff7bc6e25 in start_thread () from /lib64/libpthread.so.0
#7 0x00007ffff5ee635d in clone () from /lib64/libc.so.6
Thanks for your help!
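You are potentially changing the capacity of the vector after you have already started some threads. When push_back exceeds the current capacity, the vector reallocates and moves its strings while running workers may still be appending to strings in the old storage.
The easiest way to prevent the vector from re-allocating and moving its contents is to reserve the amount of space before you start the first worker thread.
So call
gTestVec.reserve(NUM);
before your loop.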
I am using Visual Studio 2017 v15.0 and the Windows 10 Anniversary Update SDK.
I build the following code (basically the sample in the GitHub repo) with cl /EHsc /O2 /DUNICODE /bigobj /await /std:c++latest, with /MT or /MD. It compiles without error.
If I run it when "message.png" is not present in the current directory, an exception is thrown, caught and reported with printf, and the program exits without crashing.
If I run it when "message.png" is present in the current directory, "Hello World!" is printed and then the program crashes for no apparent reason.
The weird thing is that if I run it inside the GDB debugger, GDB always says the program exits normally (and indeed no crash happens).
GDB output:
[New Thread 1364.0x2324]
[New Thread 1364.0x624]
[New Thread 1364.0x12cc]
[New Thread 1364.0x58c]
[New Thread 1364.0x1134]
[New Thread 1364.0x10d8]
[New Thread 1364.0x18a8]
[New Thread 1364.0x1794]
[New Thread 1364.0x20e8]
[New Thread 1364.0x2204]
[New Thread 1364.0x1030]
[New Thread 1364.0x1474]
Hello world!
[Thread 1364.0x10d8 exited with code 0]
[Thread 1364.0x624 exited with code 0]
[Thread 1364.0x20e8 exited with code 0]
[Thread 1364.0x1794 exited with code 0]
[Thread 1364.0x18a8 exited with code 0]
[Thread 1364.0x58c exited with code 0]
[Thread 1364.0x1134 exited with code 0]
[Thread 1364.0x12cc exited with code 0]
[Thread 1364.0x8d0 exited with code 0]
[Thread 1364.0x2324 exited with code 0]
[Thread 1364.0x1b38 exited with code 0]
[Thread 1364.0x2204 exited with code 0]
[Thread 1364.0x1030 exited with code 0]
[Thread 1364.0x1474 exited with code 0]
[Inferior 1 (process 1364) exited normally]
Code:
#pragma comment(lib, "windowsapp")
#pragma comment(lib, "pathcch")
#include <winrt/Windows.Storage.Streams.h>
#include <winrt/Windows.Graphics.Imaging.h>
#include <winrt/Windows.Media.Ocr.h>
#include <winrt/Windows.Networking.Sockets.h>
#include <pathcch.h>
using namespace winrt;
using namespace std::chrono;
using namespace Windows::Foundation;
using namespace Windows::Storage;
using namespace Windows::Storage::Streams;
using namespace Windows::Graphics::Imaging;
using namespace Windows::Media::Ocr;
hstring MessagePath()
{
wchar_t buffer[1024]{};
GetCurrentDirectory(_countof(buffer), buffer);
check_hresult(PathCchAppendEx(buffer, _countof(buffer), L"message.png", PATHCCH_ALLOW_LONG_PATHS));
return buffer;
}
IAsyncOperation<hstring> AsyncSample()
{
StorageFile file = co_await StorageFile::GetFileFromPathAsync(MessagePath());
IRandomAccessStream stream = co_await file.OpenAsync(FileAccessMode::Read);
BitmapDecoder decoder = co_await BitmapDecoder::CreateAsync(stream);
SoftwareBitmap bitmap = co_await decoder.GetSoftwareBitmapAsync();
OcrEngine engine = OcrEngine::TryCreateFromUserProfileLanguages();
OcrResult result = co_await engine.RecognizeAsync(bitmap);
return result.Text();
}
int main()
{
init_apartment();
try
{
printf("%ls\n", AsyncSample().get().c_str());
}
catch (hresult_error const & e)
{
printf("hresult_error: (0x%8X) %ls\n", e.code(), e.message().c_str());
}
return 0;
}
Turns out hstring returned by AsyncSample().get() is not null terminated, so printf crashes.
try
{
auto ans = AsyncSample().get();
printf("[%u]: ", ans.size());
auto s = ans.c_str();
for (uint32_t i = 0; i < ans.size(); i++) {
printf("%lc", s[i]);
}
putchar('\n');
}
In some of the answers to related questions I could see that gdb 7.3 should support displaying thread names, at least with the 'info threads' command.
But I am not even getting that luxury. Please help me understand what I am doing wrong.
My sample code used for testing:
#include <stdio.h>
#include <pthread.h>
#include <sys/prctl.h>
static pthread_t ta, tb;
void *
fx (void *param)
{
int i = 0;
prctl (PR_SET_NAME, "Mythread1", 0, 0, 0);
while (i < 1000)
{
i++;
printf ("T1%d ", i);
}
}
void *
fy (void *param)
{
int i = 0;
prctl (PR_SET_NAME, "Mythread2", 0, 0, 0);
while (i < 100)
{
i++;
printf ("T2%d ", i);
}
sleep (10);
/* generating segmentation fault */
int *p;
p = NULL;
printf ("%d\n", *p);
}
int
main ()
{
pthread_create (&ta, NULL, fx, 0);
pthread_create (&tb, NULL, fy, 0);
void *retval;
pthread_join (ta, &retval);
pthread_join (tb, &retval);
return 0;
}
Output (using the core dump generated by the segmentation fault):
(gdb) core-file core.14001
[New LWP 14003]
[New LWP 14001]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
Core was generated by `./thread_Ex'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x08048614 in fy (param=0x0) at thread_Ex.c:30
30 printf("%d\n",*p);
(gdb) info threads
Id Target Id Frame
2 Thread 0xb77d76c0 (LWP 14001) 0x00b95424 in __kernel_vsyscall ()
* 1 Thread 0xb6dd5b70 (LWP 14003) 0x08048614 in fy (param=0x0) at thread_Ex.c:30
(gdb) bt
#0 0x08048614 in fy (param=0x0) at thread_Ex.c:30
#1 0x006919e9 in start_thread () from /lib/libpthread.so.0
#2 0x005d3f3e in clone () from /lib/libc.so.6
(gdb) thread apply all bt
Thread 2 (Thread 0xb77d76c0 (LWP 14001)):
#0 0x00b95424 in __kernel_vsyscall ()
#1 0x006920ad in pthread_join () from /lib/libpthread.so.0
#2 0x080486a4 in main () at thread_Ex.c:50
Thread 1 (Thread 0xb6dd5b70 (LWP 14003)):
#0 0x08048614 in fy (param=0x0) at thread_Ex.c:30
#1 0x006919e9 in start_thread () from /lib/libpthread.so.0
#2 0x005d3f3e in clone () from /lib/libc.so.6
(gdb) q
As you can see, I can't see any of the thread names that I have set. What could be wrong?
Note:
I am using gdb version 7.7 (Downloaded and compiled using no special options)
commands used to compile & install gdb : ./configure && make && make install
As far as I am aware, thread names are not present in the core dump.
If they are available somehow, please file a gdb bug.
I get the thread name displayed on CentOS 6.5, but not on CentOS 6.4.
I am trying to implement a thread pool using the ACE semaphore library. It does not provide an API like sem_getvalue, which POSIX semaphores have. I need to debug a flow that is not behaving as expected. Can I examine the semaphore in GDB? I am using CentOS.
I initialized two semaphores through the constructor with counts of 10 and 0. They are declared static in the class and initialized in the .cpp file as
DP_Semaphore ThreadPool::availableThreads(10);
DP_Semaphore ThreadPool::availableWork(0);
But when I print the semaphores in GDB using the print command, I get output like this:
(gdb) p this->availableWork
$7 = {
sema = {
semaphore_ = {
sema_ = 0x6fe5a0,
name_ = 0x0
},
removed_ = false
}
}
(gdb) p this->availableThreads
$8 = {
sema = {
semaphore_ = {
sema_ = 0x6fe570,
name_ = 0x0
},
removed_ = false
}
}
Is there a tool which can help me here, or shall I switch to POSIX threads and rewrite all my code?
EDIT: As requested by @timrau, the output of call this->availableWork->dump():
(gdb) p this->availableWork.dump()
[Switching to Thread 0x2aaaae97e940 (LWP 28609)]
The program stopped in another thread while making a function call from GDB.
Evaluation of the expression containing the function
(DP_Semaphore::dump()) will be abandoned.
When the function is done executing, GDB will silently stop.
(gdb) call this->availableWork.dump()
[Switching to Thread 0x2aaaaf37f940 (LWP 28612)]
The program stopped in another thread while making a function call from GDB.
Evaluation of the expression containing the function
(DP_Semaphore::dump()) will be abandoned.
When the function is done executing, GDB will silently stop.
(gdb) info threads
[New Thread 0x2aaaafd80940 (LWP 28613)]
6 Thread 0x2aaaafd80940 (LWP 28613) 0x00002aaaac10a61e in __lll_lock_wait_private ()
from /lib64/libpthread.so.0
* 5 Thread 0x2aaaaf37f940 (LWP 28612) ThreadPool::fetchWork (this=0x78fef0, worker=0x2aaaaf37f038)
at ../../CallManager/src/DP_CallControlTask.cpp:1043
4 Thread 0x2aaaae97e940 (LWP 28609) DP_Semaphore::dump (this=0x6e1460) at ../../Common/src/DP_Semaphore.cpp:21
2 Thread 0x2aaaad57c940 (LWP 28607) 0x00002aaaabe01ff3 in __find_specmb () from /lib64/libc.so.6
1 Thread 0x2aaaacb7b070 (LWP 28604) 0x00002aaaac1027c0 in __nptl_create_event () from /lib64/libpthread.so.0
(gdb)
sema.semaphore_.sema_ in your code looks like a pointer. Try to find its type in the ACE headers, then cast the address to a pointer to that type and print what it points to:
(gdb) p *((sem_t *)0x6fe570)
Update: try casting the address within the structure you posted to sem_t *. If you use Linux, ACE should be using POSIX semaphores, so the type sem_t must be visible to gdb.
I am writing a ThreadPool Class in C++ using Boost ASIO. The following is the code that I have written so far:
The ThreadPool Class
using namespace std;
using namespace boost;
class ThreadPoolClass {
private:
/* The limit to the maximum number of threads to be
* instantiated within this pool
*/
int maxThreads;
/* Group of threads in the Pool */
thread_group threadPool;
asio::io_service asyncIOService;
void _Init()
{
maxThreads = 0;
}
public:
ThreadPoolClass();
ThreadPoolClass(int maxNumThreads);
ThreadPoolClass(const ThreadPoolClass& orig);
void CreateThreadPool();
void RunTask(JobClass * aJob);
virtual ~ThreadPoolClass();
};
ThreadPoolClass::ThreadPoolClass() {
_Init();
}
ThreadPoolClass::ThreadPoolClass(int maxNumThreads) {
_Init();
maxThreads = maxNumThreads;
}
void ThreadPoolClass::CreateThreadPool() {
asio::io_service::work work(asyncIOService);
for (int i = 0; i < maxThreads; i++) {
cout<<"Pushed"<<endl;
threadPool.create_thread(bind(&asio::io_service::run, &asyncIOService));
}
}
void ThreadPoolClass::RunTask(JobClass * aJob) {
cout<<"RunTask"<<endl;
asyncIOService.post(bind(&JobClass::Run,aJob));
}
ThreadPoolClass::ThreadPoolClass(const ThreadPoolClass& orig) {
}
ThreadPoolClass::~ThreadPoolClass() {
cout<<"Kill ye all"<<endl;
asyncIOService.stop();
threadPool.join_all();
}
The Job Class
using namespace std;
class JobClass {
private:
int a;
int b;
int c;
public:
JobClass() {
//Empty Constructor
}
JobClass(int val) {
a = val;
b = val - 1;
c = val + 1;
}
void Run()
{
cout<<"a: "<<a<<endl;
cout<<"b: "<<b<<endl;
cout<<"c: "<<c<<endl;
}
};
Main
using namespace std;
int main(int argc, char** argv) {
ThreadPoolClass ccThrPool(20);
ccThrPool.CreateThreadPool();
JobClass ccJob(10);
cout << "Starting..." << endl;
while(1)
{
ccThrPool.RunTask(&ccJob);
}
return 0;
}
So, basically I am creating 20 threads, but for now I am posting only one (the same) task to be run by io_service, just to keep things simple here and get to the root cause. The following is the output when I run this program in GDB:
Pushed
[New Thread 0xb7cd2b40 (LWP 15809)]
Pushed
[New Thread 0xb74d1b40 (LWP 15810)]
Pushed
[New Thread 0xb68ffb40 (LWP 15811)]
Pushed
[New Thread 0xb60feb40 (LWP 15812)]
Pushed
[New Thread 0xb56fdb40 (LWP 15813)]
Pushed
[New Thread 0xb4efcb40 (LWP 15814)]
Pushed
[New Thread 0xb44ffb40 (LWP 15815)]
Pushed
[New Thread 0xb3affb40 (LWP 15816)]
Pushed
[New Thread 0xb30ffb40 (LWP 15817)]
Pushed
[New Thread 0xb28feb40 (LWP 15818)]
Pushed
[New Thread 0xb20fdb40 (LWP 15819)]
Pushed
[New Thread 0xb18fcb40 (LWP 15820)]
Pushed
[New Thread 0xb10fbb40 (LWP 15821)]
Pushed
[New Thread 0xb08fab40 (LWP 15822)]
Pushed
[New Thread 0xb00f9b40 (LWP 15823)]
Pushed
[New Thread 0xaf8f8b40 (LWP 15824)]
Pushed
[New Thread 0xaf0f7b40 (LWP 15825)]
Pushed
[New Thread 0xae8f6b40 (LWP 15826)]
Pushed
[New Thread 0xae0f5b40 (LWP 15827)]
Pushed
[New Thread 0xad8f4b40 (LWP 15828)]
Starting...
RunTask
Kill ye all
[Thread 0xb4efcb40 (LWP 15814) exited]
[Thread 0xb30ffb40 (LWP 15817) exited]
[Thread 0xaf8f8b40 (LWP 15824) exited]
[Thread 0xae8f6b40 (LWP 15826) exited]
[Thread 0xae0f5b40 (LWP 15827) exited]
[Thread 0xaf0f7b40 (LWP 15825) exited]
[Thread 0xb56fdb40 (LWP 15813) exited]
[Thread 0xb18fcb40 (LWP 15820) exited]
[Thread 0xb10fbb40 (LWP 15821) exited]
[Thread 0xb20fdb40 (LWP 15819) exited]
[Thread 0xad8f4b40 (LWP 15828) exited]
[Thread 0xb3affb40 (LWP 15816) exited]
[Thread 0xb7cd2b40 (LWP 15809) exited]
[Thread 0xb60feb40 (LWP 15812) exited]
[Thread 0xb08fab40 (LWP 15822) exited]
[Thread 0xb68ffb40 (LWP 15811) exited]
[Thread 0xb74d1b40 (LWP 15810) exited]
[Thread 0xb28feb40 (LWP 15818) exited]
[Thread 0xb00f9b40 (LWP 15823) exited]
[Thread 0xb44ffb40 (LWP 15815) exited]
[Inferior 1 (process 15808) exited normally]
I have two questions:
Why is it so that my threads are exiting, even when I am posting
tasks in a while loop?
Why is the output from JobClass i.e. the values of the variables a,b
and c not getting printed?
I think this happens because you create the work object as a local variable in the CreateThreadPool method, so it is automatically destroyed when it goes out of scope. Without an active work object, io_service::run() returns as soon as it has nothing to do, so the threads exit and your tasks are not processed.
Try to make 'work' instance variable of your ThreadPool class, not local one in the method.
class ThreadPoolClass {
private:
thread_group threadPool;
asio::io_service asyncIOService;
std::auto_ptr<asio::io_service::work> work_;
public:
};
ThreadPoolClass::ThreadPoolClass(int maxNumThreads) {
_Init();
maxThreads = maxNumThreads;
}
void ThreadPoolClass::CreateThreadPool() {
work_.reset(new asio::io_service::work(asyncIOService));
for (int i = 0; i < maxThreads; i++) {
cout<<"Pushed"<<endl;
threadPool.create_thread(bind(&asio::io_service::run, &asyncIOService));
}
}
OK, I'll be the first to admit I don't know boost, and more specifically boost::asio, from a hole in the ground, but I know a hella-lot about thread pools and work crews.
The threads are supposed to sleep until notified of new work, but if they are not configured to do so they will likely just finish their thread proc and exit. A tell-tale sign that this is the case is to start up a pool and sleep for a reasonable amount of time before posting any work: if the pool threads are all terminating, they're not properly waiting. A quick perusal of boost docs yielded this and it may be related to your problem.
On that note, is it possible that the destructor of your pool from the main() entry point is, in fact, prematurely killing your work crew? I see the join_all, but that stop() gives me the willies. If it does what its name implies, that would explain a lot. According to the description of that stop() call from the docs:
To effect a shutdown, the application will then need to call the
io_service object's stop() member function. This will cause the
io_service run() call to return as soon as possible, abandoning
unfinished operations and without permitting ready handlers to be
dispatched.
That mention of immediate shutdown and abandonment seems suspiciously similar to your current situation.
Again, I don't know boost::asio from Adam, but were I on this I would check the startup configuration for the boost thread objects. They likely require configuration for how to start, how to wait, etc. There must be numerous samples of using boost::asio on the web concerning configuring the very thing you're describing here, namely a work crew paradigm. I see boost::asio a TON on SO, so there are likely many related or near-related questions as well.
Please feel free to downgrade this if it isn't anything useful, and I apologize if that is the case.