Is it OK to call std::async at high frequency? - c++

I have a little program I wrote that uses std::async for parallelism, and it is crashing on me. I'm pretty sure that there are much better ways to do this, but for now I just want to know what is happening here. I'm not going to post the exact code since I do not think it really makes a difference. It basically looks something like this:
while(1)
{
std::vector<Things> things(256);
auto update_the_things = [&](int start, int end) { /* some code */ };
auto handle1 = std::async(std::launch::async, update_the_things, 0, things.size() / 4);
auto handle2 = std::async(std::launch::async, update_the_things, things.size() / 4, things.size() / 4 * 2);
auto handle3 = std::async(std::launch::async, update_the_things, things.size() / 4 * 2, things.size() / 4 * 3);
update_the_things(things.size() / 4 * 3, things.size());
handle1.get();
handle2.get();
handle3.get();
}
This loop runs several thousand times per second and after a random amount of time (5 seconds - 1 minute) it crashes. If I look in task manager I see that the thread count for this program is rapidly fluctuating, which makes me think that std::async is launching new threads with each call. I would have thought it would work with a thread pool or something. In any case, is this crashing because I am doing something wrong?
Using GDB I get the following:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 3560.0x107c]
0x0000000000000000 in ?? ()
#0 0x0000000000000000 in ?? ()
#1 0x000000000041d18c in pthread_create_wrapper ()
#2 0x0000000000000000 in ?? ()
Output from gcc -v as requested:
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=c:/tdm-gcc-64/bin/../libexec/gcc/x86_64-w64-mingw32/4.8.1/lto-wrapper.exe
Target: x86_64-w64-mingw32
Configured with: ../../../src/gcc-4.8.1/configure --build=x86_64-w64-mingw32 --enable-targets=all --enable-languages=ada,c,c++,fortran,lto,objc,obj-c++ --enable-libgomp --enable-lto --enable-graphite --enable-cxx-flags=-DWINPTHREAD_STATIC --enable-libstdcxx-debug --enable-threads=posix --enable-version-specific-runtime-libs --enable-fully-dynamic-string --enable-libstdcxx-threads --enable-libstdcxx-time --with-gnu-ld --disable-werror --disable-nls --disable-win32-registry --prefix=/mingw64tdm --with-local-prefix=/mingw64tdm --with-pkgversion=tdm64-2 --with-bugurl=http://tdm-gcc.tdragon.net/bugs
Thread model: posix
gcc version 4.8.1 (tdm64-2)

This standard-conforming program also crashes, and usually much faster:
#include <iostream>
#include <future>
int main() {
try {
for (;;) {
std::async(std::launch::async, []{}).get();
}
} catch(...) { std::cout << "Something threw\n"; }
}
It's a bug in the implementation.
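Independent of the implementation bug, one way to avoid creating threads thousands of times per second is to keep a few long-lived worker threads and hand them tasks. The following is only a minimal sketch under that assumption: `WorkerPool`, `submit` and `run_demo` are hypothetical names (nothing here is a standard library facility), and `update` is a stand-in for the question's `update_the_things`.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical fixed-size pool: threads are created once and reused.
class WorkerPool {
public:
    explicit WorkerPool(unsigned n) {
        for (unsigned i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }
    ~WorkerPool() {
        {
            std::lock_guard<std::mutex> lk(m_);
            stop_ = true;
        }
        cv_.notify_all();
        for (auto &t : workers_) t.join();  // join before members are destroyed
    }
    // Queue a task; the returned future becomes ready when it has run.
    std::future<void> submit(std::function<void()> fn) {
        auto task = std::make_shared<std::packaged_task<void()>>(std::move(fn));
        std::future<void> fut = task->get_future();
        {
            std::lock_guard<std::mutex> lk(m_);
            tasks_.push([task] { (*task)(); });
        }
        cv_.notify_one();
        return fut;
    }
private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return stop_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stop requested and queue drained
                job = std::move(tasks_.front());
                tasks_.pop();
            }
            job();  // run outside the lock
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    bool stop_ = false;
    std::vector<std::thread> workers_;  // declared last: starts after the rest exists
};

// Mirrors the loop in the question: three quarters go to the pool,
// the last quarter runs on the calling thread.
int run_demo(int iterations) {
    WorkerPool pool(3);
    std::vector<int> things(256, 0);
    auto update = [&](std::size_t start, std::size_t end) {
        for (std::size_t i = start; i < end; ++i) ++things[i];
    };
    const std::size_t q = things.size() / 4;
    for (int it = 0; it < iterations; ++it) {
        auto h1 = pool.submit([&] { update(0, q); });
        auto h2 = pool.submit([&] { update(q, 2 * q); });
        auto h3 = pool.submit([&] { update(2 * q, 3 * q); });
        update(3 * q, things.size());
        h1.get();
        h2.get();
        h3.get();
    }
    int sum = 0;
    for (int v : things) sum += v;
    return sum;
}
```

With this shape the thread count stays constant for the life of the pool, so the loop no longer depends on how the runtime creates and tears down threads.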

Related

`std::condition_variable::notify_all` deadlocks

I have C++ code where one thread produces data, pushing it into a queue, and another consumes it before passing it to other libraries for processing.
std::mutex lock;
std::condition_variable new_data;
std::vector<uint8_t> pending_bytes;
bool data_done=false;
// producer
void add_bytes(size_t byte_count, const void *data)
{
if (byte_count == 0)
return;
std::lock_guard<std::mutex> guard(lock);
uint8_t *typed_data = (uint8_t *)data;
pending_bytes.insert(pending_bytes.end(), typed_data,
typed_data + byte_count);
new_data.notify_all();
}
void finish()
{
std::lock_guard<std::mutex> guard(lock);
data_done = true;
new_data.notify_all();
}
// consumer
Result *process(void)
{
data_processor = std::unique_ptr<Processor>(new Processor());
bool done = false;
while (!done)
{
std::unique_lock<std::mutex> guard(lock);
new_data.wait(guard, [&]() {return data_done || pending_bytes.size() > 0;});
size_t byte_count = pending_bytes.size();
std::vector<uint8_t> data_copy;
if (byte_count > 0)
{
data_copy = pending_bytes; // vector copies on assignment
pending_bytes.clear();
}
done = data_done;
guard.unlock();
if (byte_count > 0)
{
data_processor->process(byte_count, data_copy.data());
}
}
return data_processor->finish();
}
Where Processor is a rather involved class with a lot of multi-threaded processing, but as far as I can see it should be separated from the code above.
Now sometimes the code deadlocks, and I'm trying to figure out the race condition. My biggest clue is that the producer thread appears to be stuck in notify_all(). In GDB I get the following backtrace, showing that notify_all is waiting on something:
[Switching to thread 3 (Thread 0x7fffe8d4c700 (LWP 45177))]
#0 0x00007ffff6a4654d in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007ffff6a44240 in pthread_cond_broadcast@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#2 0x00007ffff67e1b29 in std::condition_variable::notify_all() () from /lib64/libstdc++.so.6
#3 0x0000000001221177 in add_bytes (data=0x7fffe8d4ba70, byte_count=256,
this=0x7fffc00dbb80) at Client/file.cpp:213
while also owning the lock
(gdb) p lock
$12 = {<std::__mutex_base> = {_M_mutex = {__data = {__lock = 1, __count = 0, __owner = 45177, __nusers = 1, __kind = 0,
__spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
with the other thread waiting in the condition variable wait
[Switching to thread 5 (Thread 0x7fffe7d4a700 (LWP 45180))]
#0 0x00007ffff6a43a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) bt
#0 0x00007ffff6a43a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007ffff67e1aec in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /lib64/libstdc++.so.6
#2 0x000000000121f9a6 in std::condition_variable::wait<[...]::{lambda()#1}>(std::
unique_lock<std::mutex>&, [...]::{lambda()#1}) (__p=..., __lock=...,
this=0x7fffc00dbb28) at /opt/rh/devtoolset-9/root/usr/include/c++/9/bits/std_mutex.h:104
There are two other threads running under the Process data part, which also hang on pthread_cond_wait, but as far as I'm aware they do not share any synchronization primitives (and are just waiting for calls to processor->add_data or processor->finish).
Any ideas what notify_all is waiting for? Or ways of finding the culprit?
Edit: I reproduced the code with a dummy processor here:
https://onlinegdb.com/lp36ewyRSP
But, pretty much as expected, this doesn't reproduce the issue, so I assume there is something more intricate going on. Possibly just different timings, but maybe some interaction between condition_variable and OpenMP (used by the real processor) could cause this?
I also encountered the same problem. After a few experiments, I found that if notify_all is called after the condition_variable has been destroyed, notify_all deadlocks.
See the code below.
#include <iostream>
#include <condition_variable>
#include <thread>
#include <chrono>
std::thread* t;
void test() {
std::condition_variable cv;
std::mutex cv_m;
t = new std::thread([&](){
std::this_thread::sleep_for(std::chrono::seconds(3));
std::cout << "...before notify_all\n";
cv.notify_all();
std::cout << "...after notify_all\n";
});
std::unique_lock<std::mutex> lk(cv_m);
std::cout << "Waiting... \n";
cv.wait(lk, []{return true;});
std::cout << "...finished waiting\n";
}
int main()
{
test();
t->join();
}
On linux:
LSB Version: :core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
Distributor ID: CentOS
Description: CentOS release 6.3 (Final)
Release: 6.3
Codename: Final
uname info:
Linux xxx_name 3.10.0_3-0-0-34 #1 SMP Sun Apr 26 22:58:21 CST 2020 x86_64 x86_64 x86_64 GNU/Linux
Compile the code using gcc 8.2.0:
g++ --std=c++11 test.cpp -o test_cond -lpthread
The program hangs after printing "...before notify_all" and never reaches "...after notify_all".
However, when the same code is compiled with gcc 12.1.0, the program runs successfully.
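In the test program above, `cv` is a local of `test()`, so it is destroyed when `test()` returns, about three seconds before the thread calls `notify_all()` on it. Using a destroyed object is undefined behaviour, which would explain why the result differs between compilers. A minimal sketch of one way to keep the lifetimes safe is to join the notifying thread before the condition variable goes out of scope; `wait_with_safe_lifetime` and the `ready` flag are illustrative names, not taken from the code above.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Returns true once the waiter has observed the flag set by the notifier.
bool wait_with_safe_lifetime() {
    std::condition_variable cv;
    std::mutex cv_m;
    bool ready = false;

    std::thread notifier([&] {
        {
            std::lock_guard<std::mutex> lk(cv_m);
            ready = true;  // change the shared state under the mutex
        }
        cv.notify_all();
    });

    {
        std::unique_lock<std::mutex> lk(cv_m);
        cv.wait(lk, [&] { return ready; });
    }

    // Join before cv and cv_m are destroyed, so the notifier can never
    // touch them after the end of their lifetime.
    notifier.join();
    return ready;
}
```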
It seems to me that you should unlock your mutex in the producer before the call to notify_all (https://en.cppreference.com/w/cpp/thread/condition_variable)
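As a sketch of that suggestion, the producer can limit the lock to a nested scope and notify after the scope ends. Notifying while holding the mutex is also permitted, so this is an optimisation to try rather than a confirmed fix for the deadlock. The variable `data_mutex` is a renamed stand-in for `lock` in the question; the other names follow the question's code.

```cpp
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

std::mutex data_mutex;             // corresponds to `lock` in the question
std::condition_variable new_data;
std::vector<std::uint8_t> pending_bytes;

// Producer: append bytes under the mutex, then notify after releasing it.
void add_bytes(std::size_t byte_count, const void *data) {
    if (byte_count == 0)
        return;
    const auto *typed_data = static_cast<const std::uint8_t *>(data);
    {
        std::lock_guard<std::mutex> guard(data_mutex);
        pending_bytes.insert(pending_bytes.end(), typed_data,
                             typed_data + byte_count);
    }  // mutex released here
    new_data.notify_all();  // woken waiters can take the mutex immediately
}
```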

glibc pthread_join crashes when pthread_join() is called twice

I have written the following code using the POSIX pthread library:
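#include <stdio.h>   /* printf, perror */
#include <stdlib.h>
#include <unistd.h>  /* sleep */
#include <pthread.h>
void *thread_function(void *arg) {
char *code = "1";
return NULL; /* a thread function must return a value */
}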
int main() {
int res;
pthread_t a_thread;
void *thread_result;
res = pthread_create(&a_thread, NULL, thread_function, NULL);
if (res != 0) {
perror("Thread creation failed");
exit(EXIT_FAILURE);
}
sleep(5);
printf("\nWaiting for thread to finish...\n");
res = pthread_join(a_thread, &thread_result);
printf("res[%d]\n",res);
if (res != 0) {
perror("Thread join failed");
exit(EXIT_FAILURE);
}
res = pthread_join(a_thread, &thread_result);
printf("res[%d]\n",res);
exit(EXIT_SUCCESS);
}
On executing the code I got the following output:
Waiting for thread to finish...
res[0]
Segmentation fault (core dumped)
In the code, I want to test what happens if you call the pthread_join() function
twice after the thread has finished. The first call succeeds, and the second crashes. The backtrace:
Core was generated by `./a.out'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 pthread_join (threadid=140150565050112, thread_return=0x7fffa0a2c508) at
pthread_join.c:47
47 if (INVALID_NOT_TERMINATED_TD_P (pd))
(gdb) bt
Python Exception exceptions.ImportError No module named gdb.frames:
#0 pthread_join (threadid=140150565050112, thread_return=0x7fffa0a2c508) at
pthread_join.c:47
#1 0x00000000004008d5 in main ()
And I check the pthread_join.c file:
39 int
40 pthread_join (threadid, thread_return)
41 pthread_t threadid;
42 void **thread_return;
43 {
44 struct pthread *pd = (struct pthread *) threadid;
45
46 /* Make sure the descriptor is valid. */
47 if (INVALID_NOT_TERMINATED_TD_P (pd))
48 /* Not a valid thread handle. */
49 return ESRCH;
On line 47, the macro checks whether pd is a valid thread handle. If it is not, the function returns ESRCH (3).
However when I run the same code in another Linux environment, I got the following output:
Waiting for thread to finish...
res[0]
res[3]
Does it have anything to do with the environment? The two Linux systems have the same ldd version:
ldd (GNU libc) 2.17
same GLIBC:
GLIBCXX_3.4
GLIBCXX_3.4.1
GLIBCXX_3.4.2
GLIBCXX_3.4.3
GLIBCXX_3.4.4
GLIBCXX_3.4.5
GLIBCXX_3.4.6
GLIBCXX_3.4.7
GLIBCXX_3.4.8
GLIBCXX_3.4.9
GLIBCXX_3.4.10
GLIBCXX_3.4.11
GLIBCXX_3.4.12
GLIBCXX_3.4.13
GLIBCXX_3.4.14
GLIBCXX_3.4.15
GLIBCXX_3.4.16
GLIBCXX_3.4.17
GLIBCXX_3.4.18
GLIBCXX_3.4.19
GLIBCXX_3.4.20
GLIBC_2.3
GLIBC_2.2.5
GLIBC_2.14
GLIBC_2.17
GLIBC_2.3.2
GLIBCXX_FORCE_NEW
GLIBCXX_DEBUG_MESSAGE_LENGT
and the same linux kernel version:
Red Hat Enterprise Linux Server release 6.6 (Santiago)
pthread_join calls pthread_detach after the thread has terminated.
pthread_detach releases all of the thread's resources when the thread terminates.
From the documentation of pthread_detach:
Attempting to detach an already detached thread results in
unspecified behavior.
So you have unspecified behaviour, so you can't guarantee what will happen afterwards.
At the very least, the memory pointed to by the threadid will be freed, leading to accessing freed memory.
In short, don't call pthread_join twice on the same threadid. Why would you want to?
Edit: even simpler: the man page for pthread_join says:
Joining with a thread that has previously been joined results in
undefined behavior.
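If the calling code cannot easily guarantee a single join, one option is to track the joined state next to the thread handle, so a stale `pthread_t` is never passed to the library a second time. This is only an illustrative sketch: `JoinOnce` and `join_twice_safely` are hypothetical helpers, and returning `EINVAL` for a repeated join is a choice made here, not something pthreads specifies.

```cpp
#include <cerrno>
#include <pthread.h>

// Hypothetical helper: remembers whether the thread was already joined.
struct JoinOnce {
    pthread_t tid{};
    bool joined = false;

    // Forwards to pthread_join the first time; afterwards returns EINVAL
    // without touching the (possibly freed) thread descriptor.
    int join(void **result) {
        if (joined)
            return EINVAL;
        int res = pthread_join(tid, result);
        if (res == 0)
            joined = true;
        return res;
    }
};

static void *thread_function(void *) { return nullptr; }

// Creates a thread and joins it twice through the guard.
// Returns the first join's result; stores the second in *second_result.
int join_twice_safely(int *second_result) {
    JoinOnce t;
    int res = pthread_create(&t.tid, nullptr, thread_function, nullptr);
    if (res != 0)
        return res;
    res = t.join(nullptr);
    *second_result = t.join(nullptr);
    return res;
}
```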

Thread name not shown in info thread command when using gdb 7.7

From some of the answers to related questions, I understand that gdb 7.3 and later should display thread names, at least with the 'info threads' command.
But I am not even getting that. Please help me understand what I am doing wrong.
My sample code used for testing:
#include <stdio.h>
#include <pthread.h>
#include <sys/prctl.h>
static pthread_t ta, tb;
void *
fx (void *param)
{
int i = 0;
prctl (PR_SET_NAME, "Mythread1", 0, 0, 0);
while (i < 1000)
{
i++;
printf ("T1%d ", i);
}
}
void *
fy (void *param)
{
int i = 0;
prctl (PR_SET_NAME, "Mythread2", 0, 0, 0);
while (i < 100)
{
i++;
printf ("T2%d ", i);
}
sleep (10);
/* generating segmentation fault */
int *p;
p = NULL;
printf ("%d\n", *p);
}
int
main ()
{
pthread_create (&ta, NULL, fx, 0);
pthread_create (&tb, NULL, fy, 0);
void *retval;
pthread_join (ta, &retval);
pthread_join (tb, &retval);
return 0;
}
Output (using the core dump generated by the segmentation fault):
(gdb) core-file core.14001
[New LWP 14003]
[New LWP 14001]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
Core was generated by `./thread_Ex'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x08048614 in fy (param=0x0) at thread_Ex.c:30
30 printf("%d\n",*p);
(gdb) info threads
Id Target Id Frame
2 Thread 0xb77d76c0 (LWP 14001) 0x00b95424 in __kernel_vsyscall ()
* 1 Thread 0xb6dd5b70 (LWP 14003) 0x08048614 in fy (param=0x0) at thread_Ex.c:30
(gdb) bt
#0 0x08048614 in fy (param=0x0) at thread_Ex.c:30
#1 0x006919e9 in start_thread () from /lib/libpthread.so.0
#2 0x005d3f3e in clone () from /lib/libc.so.6
(gdb) thread apply all bt
Thread 2 (Thread 0xb77d76c0 (LWP 14001)):
#0 0x00b95424 in __kernel_vsyscall ()
#1 0x006920ad in pthread_join () from /lib/libpthread.so.0
#2 0x080486a4 in main () at thread_Ex.c:50
Thread 1 (Thread 0xb6dd5b70 (LWP 14003)):
#0 0x08048614 in fy (param=0x0) at thread_Ex.c:30
#1 0x006919e9 in start_thread () from /lib/libpthread.so.0
#2 0x005d3f3e in clone () from /lib/libc.so.6
(gdb) q
As you can see, none of the thread names that I set are shown. What could be wrong?
Note:
I am using gdb version 7.7 (Downloaded and compiled using no special options)
commands used to compile & install gdb : ./configure && make && make install
As far as I am aware, thread names are not present in the core dump.
If they are available somehow, please file a gdb bug.
I get the thread name displayed on CentOS 6.5, but not on CentOS 6.4.
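To check that the name really is set in the running process, separately from what a core file may or may not contain, the thread can read its own name back with `PR_GET_NAME`. This is a Linux-only sketch; `set_and_read_thread_name` is a hypothetical helper, and the 16-byte buffer reflects the kernel's limit on thread names (including the terminating NUL).

```cpp
#include <string>
#include <sys/prctl.h>
#include <thread>

// Sets the name of a freshly started thread and reads it back
// from the kernel inside that same thread.
std::string set_and_read_thread_name(const char *name) {
    std::string result;
    std::thread t([&] {
        prctl(PR_SET_NAME, name, 0, 0, 0);
        char buf[16] = {0};  // thread names are at most 16 bytes with the NUL
        prctl(PR_GET_NAME, buf, 0, 0, 0);
        result = buf;
    });
    t.join();
    return result;
}
```

If this returns the expected name, the `prctl` call in the sample program is working, and the missing names are more likely down to what the core file or the debugger exposes.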

Print or examine semaphore count value in GDB

I am trying to implement a thread pool using the ACE Semaphore library. It does not provide any API like sem_getvalue, which POSIX semaphores have. I need to debug a flow that is not behaving as expected. Can I examine the semaphore in GDB? I am using CentOS as the OS.
I initialized two semaphores using the default constructor providing count 0 and 10. I have declared them as static in the class and initialized it in the cpp file as
DP_Semaphore ThreadPool::availableThreads(10);
DP_Semaphore ThreadPool::availableWork(0);
But when I print the semaphores in GDB using the print command, I get output like this:
(gdb) p this->availableWork
$7 = {
sema = {
semaphore_ = {
sema_ = 0x6fe5a0,
name_ = 0x0
},
removed_ = false
}
}
(gdb) p this->availableThreads
$8 = {
sema = {
semaphore_ = {
sema_ = 0x6fe570,
name_ = 0x0
},
removed_ = false
}
}
Is there a tool which can help me here, or shall I switch to POSIX threads and rewrite all my code?
EDIT: As requested by @timrau, the output of call this->availableWork->dump()
(gdb) p this->availableWork.dump()
[Switching to Thread 0x2aaaae97e940 (LWP 28609)]
The program stopped in another thread while making a function call from GDB.
Evaluation of the expression containing the function
(DP_Semaphore::dump()) will be abandoned.
When the function is done executing, GDB will silently stop.
(gdb) call this->availableWork.dump()
[Switching to Thread 0x2aaaaf37f940 (LWP 28612)]
The program stopped in another thread while making a function call from GDB.
Evaluation of the expression containing the function
(DP_Semaphore::dump()) will be abandoned.
When the function is done executing, GDB will silently stop.
(gdb) info threads
[New Thread 0x2aaaafd80940 (LWP 28613)]
6 Thread 0x2aaaafd80940 (LWP 28613) 0x00002aaaac10a61e in __lll_lock_wait_private ()
from /lib64/libpthread.so.0
* 5 Thread 0x2aaaaf37f940 (LWP 28612) ThreadPool::fetchWork (this=0x78fef0, worker=0x2aaaaf37f038)
at ../../CallManager/src/DP_CallControlTask.cpp:1043
4 Thread 0x2aaaae97e940 (LWP 28609) DP_Semaphore::dump (this=0x6e1460) at ../../Common/src/DP_Semaphore.cpp:21
2 Thread 0x2aaaad57c940 (LWP 28607) 0x00002aaaabe01ff3 in __find_specmb () from /lib64/libc.so.6
1 Thread 0x2aaaacb7b070 (LWP 28604) 0x00002aaaac1027c0 in __nptl_create_event () from /lib64/libpthread.so.0
(gdb)
sema.semaphore_.sema_ in your code looks like a pointer. Try to find its type in the ACE headers, then cast the address to a pointer to that type and print what it points to:
(gdb) p *((sem_t *)0x6fe570)
Update: try to convert the address within the structure you posted to a sem_t pointer. If you are on Linux, ACE should be using POSIX semaphores, so the type sem_t must be visible to gdb.
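For reference, this is how the count of a POSIX semaphore is read in code with `sem_getvalue`, the API the question mentions as missing from ACE. It is a standalone sketch, assuming the underlying object really is a `sem_t` as suggested above; `semaphore_count_after_init` is a hypothetical helper and does not touch ACE at all.

```cpp
#include <semaphore.h>

// Creates a process-private semaphore with the given initial count
// and returns the count reported by sem_getvalue (or -1 on failure).
int semaphore_count_after_init(unsigned initial) {
    sem_t sem;
    if (sem_init(&sem, 0, initial) != 0)
        return -1;
    int value = -1;
    if (sem_getvalue(&sem, &value) != 0)
        value = -1;
    sem_destroy(&sem);
    return value;
}
```

For example, `semaphore_count_after_init(10)` corresponds to the `availableThreads(10)` initialisation in the question.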

Is std::async broken in gcc 4.7 on linux? [closed]

I'm testing std::async in isolation before using it in real code, to verify that it works correctly on my platform (which is ubuntu 12.10 64-bit).
It works (somewhat rarely) and usually just hangs. If it works for you, don't jump to conclusions. Try a few more times, it will probably hang.
If I remove the pthread_mutex test, it doesn't hang. This is the smallest code I can get to reproduce the hang. Is there some reason that you can't mix C pthread code with c++ async code?
#include <iostream>
#include <pthread.h>
#include <chrono>
#include <future>
#include <iomanip>
#include <sstream>
#include <type_traits>
template<typename T>
std::string format_ns(T &&value)
{
std::stringstream s;
if (std::is_floating_point<T>::value)
s << std::setprecision(3);
if (value >= 1000000000)
s << value / 1000000000 << "s";
else if (value >= 1000000)
s << value / 1000000 << "ms";
else if (value >= 1000)
s << value / 1000 << "us";
else
s << value << "ns";
return s.str();
}
template<typename F>
void test(const std::string &msg, int iter, F &&lambda)
{
std::chrono::high_resolution_clock clock;
auto st = clock.now();
int i;
for (i = 0; i < iter; ++i)
lambda();
auto en = clock.now();
std::chrono::nanoseconds dur = std::chrono::duration_cast<
std::chrono::nanoseconds>(en-st);
std::cout << msg << format_ns(dur.count() / i) << std::endl;
}
int test_pthread_mutex()
{
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
test("pthread_mutex_lock/pthread_mutex_unlock: ", 1000000000,
[&]()
{
pthread_mutex_lock(&m);
pthread_mutex_unlock(&m);
});
pthread_mutex_destroy(&m);
return 0;
}
int test_async()
{
test("async: ", 100,
[&]()
{
auto asy = std::async(std::launch::async, [](){});
asy.get();
});
return 0;
}
int main()
{
test_pthread_mutex();
test_async();
}
Here is the build command line:
g++ -Wextra -Wall --std=c++11 -pthread mutexperf/main.cpp
There are no build output messages.
Here is the output of g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.7/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.7.2-2ubuntu1' --with-bugurl=file:///usr/share/doc/gcc-4.7/README.Bugs --enable-languages=c,c++,go,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.7 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.7 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --enable-objc-gc --disable-werror --with-arch-32=i686 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.7.2 (Ubuntu/Linaro 4.7.2-2ubuntu1)
I tried it on a few different computers and found that it indeed worked fine, as @johan commented. I investigated the machine I was using and found evidence that the hard drive is beginning to fail. It has some bad sectors, and dmesg reported several "hard resets" of the HDD after an unusual 4-second freeze. Odd, I hadn't seen any issues before I posted the question. It's probably some subtle/intermittent corruption when compiling/linking or perhaps when loading the executable.
[44242.380936] ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x800000 action 0x6 frozen
[44242.380942] ata3.00: irq_stat 0x08000000, interface fatal error
[44242.380946] ata3: SError: { LinkSeq }
[44242.380950] sr 2:0:0:0: CDB:
[44242.380952] Get event status notification: 4a 01 00 00 10 00 00 00 08 00
[44242.380965] ata3.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in
[44242.380965] res 50/00:03:00:08:00/00:00:00:00:00/a0 Emask 0x10 (ATA bus error)
[44242.380968] ata3.00: status: { DRDY }
[44242.380974] ata3: hard resetting link
[44242.700025] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[44242.704849] ata3.00: configured for UDMA/100
[44242.720055] ata3: EH complete
[44970.117542] ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x800100 action 0x6 frozen
[44970.117547] ata3.00: irq_stat 0x08000000, interface fatal error
[44970.117551] ata3: SError: { UnrecovData LinkSeq }
[44970.117555] sr 2:0:0:0: CDB:
[44970.117557] Get event status notification: 4a 01 00 00 10 00 00 00 08 00
[44970.117570] ata3.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in
[44970.117570] res 50/00:03:00:08:00/00:00:00:00:00/a0 Emask 0x10 (ATA bus error)
[44970.117573] ata3.00: status: { DRDY }
[44970.117579] ata3: hard resetting link
[44970.436662] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[44970.443159] ata3.00: configured for UDMA/100
[44970.456639] ata3: EH complete
Thanks to anyone who spent time looking at my issue!
Have you tried copying the mutex rather than capturing it by reference in the lambda?
int test_pthread_mutex()
{
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
test("pthread_mutex_lock/pthread_mutex_unlock: ", 1000000000,
[=]() mutable // mutable: the by-value copy of m must be non-const to be locked
{
pthread_mutex_lock(&m);
pthread_mutex_unlock(&m);
});
pthread_mutex_destroy(&m);
return 0;
}