packaged_task hanging on operator() - c++

Compiling with gcc 4.7.2 on Ubuntu, with -std=c++11 -O0 -pthread, I somehow created a deadlock in code that doesn't seem like it should ever run into that problem. I have a thread which just acquires a lock and then runs through a vector<function<void()>>, calling everything in it. Meanwhile, the main thread pushes std::packaged_task<int()>s onto it one by one and blocks until that task's future returns a value. The tasks themselves are trivial (print and return).
Here is the full code. Running the app sometimes succeeds, but within a few tries it will hang:
#include <iostream>
#include <future>
#include <thread>
#include <vector>
#include <functional>
std::unique_lock<std::mutex> lock() {
static std::mutex mtx;
return std::unique_lock<std::mutex>{mtx};
}
int main(int argc, char** argv)
{
std::vector<std::function<void()>> messages;
std::atomic<bool> running{true};
std::thread thread = std::thread([&]{
while (running) {
auto lk = lock();
std::cout << "[T] locked with " << messages.size() << " messages." << std::endl;
for (auto& fn: messages) {
fn();
}
messages.clear();
}
});
for (int i = 0; i < 1000000; ++i) {
std::packaged_task<int()> task([=]{
std::cout << "[T] returning " << i << std::endl;
return i;
});
{
auto lk = lock();
messages.emplace_back(std::ref(task));
}
task.get_future().get();
}
running = false;
thread.join();
}
Sample output:
[T] returning 127189
[T] locked with 0 messages.
[T] locked with 0 messages.
[T] locked with 0 messages.
[T] locked with 0 messages.
[T] locked with 0 messages.
[T] locked with 0 messages.
[T] locked with 0 messages.
[T] locked with 1 messages.
[T] returning 127190
[T] locked with 0 messages.
[T] locked with 0 messages.
[T] locked with 0 messages.
[T] locked with 0 messages.
[T] locked with 0 messages.
[T] locked with 1 messages.
[T] returning 127191
[T] locked with 0 messages.
[T] locked with 0 messages.
[T] locked with 0 messages.
[T] locked with 0 messages.
[T] locked with 1 messages.
... hangs forever ...
What's going on? Why does the call into packaged_task::operator() hang? Where is the deadlock? Is this a gcc bug?
[update] Upon deadlock, the two threads are at:
Thread 1 (line 39 is the task.get_future().get() line):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x00007feb01fe800c in __gthread_cond_wait (this=Unhandled dwarf expression opcode 0xf3
)
at [snip]/libstdc++-v3/include/x86_64-unknown-linux-gnu/bits/gthr-default.h:879
#2 std::condition_variable::wait (this=Unhandled dwarf expression opcode 0xf3
) at [snip]/gcc-4.7.2/libstdc++-v3/src/c++11/condition_variable.cc:52
#3 0x0000000000404aff in void std::condition_variable::wait<std::__future_base::_State_base::wait()::{lambda()#1}>(std::unique_lock<std::mutex>&, std::__future_base::_State_base::wait()::{lambda()#1}) (this=0x6111e0, __lock=..., __p=...)
at [snip]gcc-4.7.2/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/condition_variable:93
#4 0x0000000000404442 in std::__future_base::_State_base::wait (this=0x6111a8)
at [snip]gcc-4.7.2/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/future:331
#5 0x00000000004060fb in std::__basic_future<int>::_M_get_result (this=0x7fffc451daa0)
at [snip]gcc-4.7.2/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/future:601
#6 0x0000000000405488 in std::future<int>::get (this=0x7fffc451daa0)
at [snip]gcc-4.7.2/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/future:680
#7 0x00000000004024dc in main (argc=1, argv=0x7fffc451dbb8) at test.cxx:39
and Thread 2 (line 22 is the fn() line):
#0 pthread_once () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_once.S:95
#1 0x00000000004020f6 in __gthread_once (__once=0x611214, __func=0x401e68 <__once_proxy@plt>)
at [snip]/gcc-4.7.2/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/x86_64-unknown-linux-gnu/bits/gthr-default.h:718
#2 0x0000000000404db1 in void std::call_once<void (std::__future_base::_State_base::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()()>&, bool&), std::__future_base::_State_base* const, std::reference_wrapper<std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()()> >, std::reference_wrapper<bool> >(std::once_flag&, void (std::__future_base::_State_base::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()()>&, bool&), std::__future_base::_State_base* const&&, std::reference_wrapper<std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()()> >&&, std::reference_wrapper<bool>&&) (__once=..., __f=#0x7feb014fdc10)
at [snip]/gcc-4.7.2/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/mutex:819
#3 0x0000000000404517 in std::__future_base::_State_base::_M_set_result(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()()>, bool) (this=0x6111a8, __res=..., __ignore_failure=false)
at [snip]/gcc-4.7.2/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/future:362
#4 0x0000000000407af0 in std::__future_base::_Task_state<int ()()>::_M_run() (this=0x6111a8)
at [snip]/gcc-4.7.2/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/future:1271
#5 0x00000000004076cc in std::packaged_task<int ()()>::operator()() (this=0x7fffc451da30)
at [snip]/gcc-4.7.2/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/future:1379
#6 0x000000000040745a in std::_Function_handler<void ()(), std::reference_wrapper<std::packaged_task<int ()()> > >::_M_invoke(std::_Any_data const&) (
__functor=...) at [snip]/gcc-4.7.2/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/functional:1956
#7 0x00000000004051f2 in std::function<void ()()>::operator()() const (this=0x611290)
at [snip]/gcc-4.7.2/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/functional:2311
#8 0x000000000040232f in operator() (__closure=0x611040) at test.cxx:22
#9 0x0000000000403d8e in _M_invoke<> (this=0x611040)
at [snip]/gcc-4.7.2/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/functional:1598
#10 0x0000000000403cdb in operator() (this=0x611040)
at [snip]/gcc-4.7.2/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/functional:1586
#11 0x0000000000403c74 in _M_run (this=0x611028) at [snip]/gcc-4.7.2/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/thread:115
#12 0x00007feb01feae10 in execute_native_thread_routine (__p=Unhandled dwarf expression opcode 0xf3
) at [snip]/gcc-4.7.2/libstdc++-v3/src/c++11/thread.cc:73
#13 0x00007feb018879ca in start_thread (arg=<value optimized out>) at pthread_create.c:300
#14 0x00007feb015e569d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#15 0x0000000000000000 in ?? ()

It seems that the problem is that you can destroy the packaged_task before operator() has returned in the worker thread: get() unblocks the main thread as soon as the result is stored, which happens inside operator(), so the task object may go out of scope while the worker is still executing it. This is most likely undefined behaviour. The program works fine for me if I re-acquire the mutex in the loop after waiting for the future to return a result. This serializes operator() and the destructor of the packaged_task.
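A minimal sketch of the change I mean, using your own lock() helper and messages vector (the scoped re-lock after get() is the only addition):
for (int i = 0; i < 1000000; ++i) {
    std::packaged_task<int()> task([i]{
        std::cout << "[T] returning " << i << std::endl;
        return i;
    });
    {
        auto lk = lock();
        messages.emplace_back(std::ref(task));
    }
    task.get_future().get();
    {
        // get() can return while the worker is still inside operator(),
        // because the shared state becomes ready before operator() finishes.
        // Re-acquiring the mutex blocks until the worker has finished the
        // whole batch and released the lock, so operator() has definitely
        // returned before ~packaged_task runs at the end of this iteration.
        auto lk = lock();
    }
}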

I can't explain why your code was broken, but I did find a way to fix it (storing tasks, not std::functions constructed from tasks):
#include <iostream>
#include <future>
#include <thread>
#include <vector>
#include <functional>
#include <unistd.h>
int main(int argc, char** argv)
{
// Let's face it - your lock() function was kinda weird.
std::mutex mtx;
// I've changed this to a vector of tasks, from a vector
// of functions. Seems to have done the job. Not sure exactly
// why but this seems to be the proper way to go.
std::vector<std::packaged_task<int()>> messages;
std::atomic<bool> running{true};
std::thread thread([&]{
while (running) {
std::unique_lock<std::mutex> l{mtx};
std::cout << "[T] locked with " << messages.size() << " messages." << std::endl;
for (auto& fn: messages) {
fn();
}
messages.clear();
}
});
for (int i = 0; i < 1000000; ++i) {
std::packaged_task<int()> task([i]{
std::cout << "[T] returning " << i << std::endl;
return i;
});
// Grab the future before moving the task into the vector below;
// calling task.get_future() on the moved-from task afterwards
// complained about having no shared state.
std::future<int> f = task.get_future();
{
std::unique_lock<std::mutex> l{mtx};
messages.emplace_back(std::move(task));
}
f.get();
}
running = false;
thread.join();
}
At the very least, if this code can also deadlock, it hasn't done so for me yet.

Related

boost::function deallocation segmentation fault in thread pool

I'm trying to make a thread pool that blocks the main thread until all its children have completed. The real-world use case for this is a "Controller" process that spawns independent processes for the user to interact with.
Unfortunately, when main exits, a segmentation fault is encountered. I cannot figure out the cause of this segmentation fault.
I've authored a Process class which does little more than open a shell script (called waiter.sh, which contains a sleep 5) and wait for the pid to exit. The Process class is initialized and then the Wait() method is placed in one of the threads in the thread pool.
The problem arises when ~thread_pool() is called. The std::queue cannot properly deallocate the boost::function passed to it, even though the reference to Process is still valid.
#include <sys/types.h>
#include <sys/wait.h>
#include <spawn.h>
#include <queue>
#include <boost/bind.hpp>
#include <boost/thread.hpp>
extern char **environ;
class Process {
private:
pid_t pid;
int status;
public:
Process() : status(0), pid(-1) {
}
~Process() {
std::cout << "calling ~Process" << std::endl;
}
void Spawn(char **argv) {
// spawn via posix_spawn and wait for the pid to return
status = posix_spawn(&pid, "waiter.sh", NULL, NULL, argv, environ);
if (status != 0) {
perror("unable to spawn");
return;
}
}
void Wait() {
std::cout << "spawned proc with " << pid << std::endl;
waitpid(pid, &status, 0);
// wait(&pid);
std::cout << "wait complete" << std::endl;
}
};
Below is the thread_pool class. This is loosely adapted from the accepted answer for this question
class thread_pool {
private:
std::queue<boost::function<void() >> tasks;
boost::thread_group threads;
std::size_t available;
boost::mutex mutex;
boost::condition_variable condition;
bool running;
public:
thread_pool(std::size_t pool_size) : available(pool_size), running(true) {
std::cout << "creating " << pool_size << " threads" << std::endl;
for (std::size_t i = 0; i < available; ++i) {
threads.create_thread(boost::bind(&thread_pool::pool_main, this));
}
}
~thread_pool() {
std::cout << "~thread_pool" << std::endl;
{
boost::unique_lock<boost::mutex> lock(mutex);
running = false;
condition.notify_all();
}
try {
threads.join_all();
} catch (const std::exception &) {
// suppress exceptions
}
}
template <typename Task>
void run_task(Task task) {
boost::unique_lock<boost::mutex> lock(mutex);
if (0 == available) {
return; //\todo err
}
--available;
tasks.push(boost::function<void()>(task));
condition.notify_one();
return;
}
private:
void pool_main() {
// wait on condition variable while the task is empty and the pool is still
// running
boost::unique_lock<boost::mutex> lock(mutex);
while (tasks.empty() && running) {
condition.wait(lock);
}
// copy task locally and remove from the queue. this is
// done within its own scope so that the task object is destructed
// immediately after running the task. This is useful in the
// event that the function contains shared_ptr arguments
// bound via 'bind'
{
auto task = tasks.front();
tasks.pop();
lock.unlock();
// run the task
try {
std::cout << "running task" << std::endl;
task();
} catch (const std::exception &) {
// suppress
}
}
// task has finished so increment count of available threads
lock.lock();
++available;
}
};
Here is the main:
int main() {
// input arguments are not required
char *argv[] = {NULL};
Process process;
process.Spawn(argv);
thread_pool pool(5);
pool.run_task(boost::bind(&Process::Wait, &process));
return 0;
}
The output for this is
creating 5 threads
~thread_pool
I am waiting... (from waiter.sh)
running task
spawned proc with 2573
running task
running task
running task
running task
wait complete
Segmentation fault (core dumped)
And here is the stack trace:
Starting program: /home/jandreau/NetBeansProjects/Controller/dist/Debug/GNU- Linux/controller
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
creating 5 threads
[New Thread 0x7ffff691d700 (LWP 2600)]
[New Thread 0x7ffff611c700 (LWP 2601)]
[New Thread 0x7ffff591b700 (LWP 2602)]
[New Thread 0x7ffff511a700 (LWP 2603)]
[New Thread 0x7ffff4919700 (LWP 2604)]
~thread_pool
running task
running task
spawned proc with 2599
[Thread 0x7ffff611c700 (LWP 2601) exited]
running task
[Thread 0x7ffff591b700 (LWP 2602) exited]
running task
[Thread 0x7ffff511a700 (LWP 2603) exited]
running task
[Thread 0x7ffff4919700 (LWP 2604) exited]
I am waiting...
wait complete
[Thread 0x7ffff691d700 (LWP 2600) exited]
Thread 1 "controller" received signal SIGSEGV, Segmentation fault.
0x000000000040f482 in boost::detail::function::basic_vtable0<void>::clear (
this=0xa393935322068, functor=...)
at /usr/include/boost/function/function_template.hpp:509
509 if (base.manager)
(gdb) where
#0 0x000000000040f482 in boost::detail::function::basic_vtable0<void>::clear (
this=0xa393935322068, functor=...)
at /usr/include/boost/function/function_template.hpp:509
#1 0x000000000040e263 in boost::function0<void>::clear (this=0x62ef50)
at /usr/include/boost/function/function_template.hpp:883
#2 0x000000000040cf20 in boost::function0<void>::~function0 (this=0x62ef50,
__in_chrg=<optimized out>)
at /usr/include/boost/function/function_template.hpp:765
#3 0x000000000040b28e in boost::function<void ()>::~function() (
this=0x62ef50, __in_chrg=<optimized out>)
at /usr/include/boost/function/function_template.hpp:1056
#4 0x000000000041193a in std::_Destroy<boost::function<void ()> >(boost::function<void ()>*) (__pointer=0x62ef50)
at /usr/include/c++/5/bits/stl_construct.h:93
#5 0x00000000004112df in std::_Destroy_aux<false>::__destroy<boost::function<void ()>*>(boost::function<void ()>*, boost::function<void ()>*) (
__first=0x62ef50, __last=0x62ed50)
at /usr/include/c++/5/bits/stl_construct.h:103
#6 0x0000000000410d16 in std::_Destroy<boost::function<void ()>*>(boost::function<void ()>*, boost::function<void ()>*) (__first=0x62edd0, __last=0x62ed50)
at /usr/include/c++/5/bits/stl_construct.h:126
#7 0x0000000000410608 in std::_Destroy<boost::function<void ()>*, boost::function<void ()> >(boost::function<void ()>*, boost::function<void ()>*, std::allocat---Type <return> to continue, or q <return> to quit---
or<boost::function<void ()> >&) (__first=0x62edd0, __last=0x62ed50)
at /usr/include/c++/5/bits/stl_construct.h:151
#8 0x000000000040fac5 in std::deque<boost::function<void ()>, std::allocator<boost::function<void ()> > >::_M_destroy_data_aux(std::_Deque_iterator<boost::function<void ()>, boost::function<void ()>&, boost::function<void ()>*>, std::_Deque_iterator<boost::function<void ()>, boost::function<void ()>&, boost::function<void ()>*>) (this=0x7fffffffdaf0, __first=..., __last=...)
at /usr/include/c++/5/bits/deque.tcc:845
#9 0x000000000040e6e4 in std::deque<boost::function<void ()>, std::allocator<boost::function<void ()> > >::_M_destroy_data(std::_Deque_iterator<boost::function<void ()>, boost::function<void ()>&, boost::function<void ()>*>, std::_Deque_iterator<boost::function<void ()>, boost::function<void ()>&, boost::function<void ()>*>, std::allocator<boost::function<void ()> > const&) (
this=0x7fffffffdaf0, __first=..., __last=...)
at /usr/include/c++/5/bits/stl_deque.h:2037
#10 0x000000000040d0c8 in std::deque<boost::function<void ()>, std::allocator<boost::function<void ()> > >::~deque() (this=0x7fffffffdaf0,
__in_chrg=<optimized out>) at /usr/include/c++/5/bits/stl_deque.h:1039
#11 0x000000000040b3ce in std::queue<boost::function<void ()>, std::deque<boost::function<void ()>, std::allocator<boost::function<void ()> > > >::~queue() (
this=0x7fffffffdaf0, __in_chrg=<optimized out>)
at /usr/include/c++/5/bits/stl_queue.h:96
#12 0x000000000040b6c0 in thread_pool::~thread_pool (this=0x7fffffffdaf0,
---Type <return> to continue, or q <return> to quit---
__in_chrg=<optimized out>) at main.cpp:63
#13 0x0000000000408b60 in main () at main.cpp:140
I'm puzzled by this because the Process hasn't yet gone out of scope and I'm passing a copy of the boost::function<void()> to the thread pool for processing.
Any ideas?
The stack trace indicates that you are destroying a boost::function that has not been properly initialized (e.g. some random memory location that is treated as being a boost::function) or that you are destroying a boost::function twice.
The problem is that your program pushes to tasks only once, but pops five times, hence you remove elements from an empty deque, which is undefined behaviour.
The while loop in pool_main exits not only when a task arrives but also when running becomes false, and running can become false while the deque is still empty. The code then pops unconditionally. You might consider correcting pool_main as follows:
void pool_main() {
// wait on condition variable
// while the task is empty and the pool is still
// running
boost::unique_lock<boost::mutex> lock(mutex);
while (tasks.empty() && running) {
condition.wait(lock);
}
// copy task locally and remove from the queue. this is
// done within its own scope so that the task object is destructed
// immediately after running the task. This is useful in the
// event that the function contains shared_ptr arguments
// bound via 'bind'
if (!tasks.empty ()) { // <--- !!!!!!!!!!!!!!!!!!!!!!!!
auto task = tasks.front();
tasks.pop();
lock.unlock();
// run the task
try {
std::cout << "running task" << std::endl;
task();
} catch (const std::exception &) {
// suppress
}
}
// task has finished so increment count of available threads
lock.lock();
++available;
};
I am, however, not sure whether the logic regarding available is correct. Shouldn't available be decremented when a thread starts processing a task and incremented when it is finished (hence be changed within pool_main only, and only within the newly introduced if clause)? A sketch of what I mean is below.
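This is only a rough, untested sketch of that idea (it assumes the --available in run_task is removed; everything else in the class stays as it is):
void pool_main() {
    boost::unique_lock<boost::mutex> lock(mutex);
    while (tasks.empty() && running) {
        condition.wait(lock);
    }
    if (!tasks.empty()) {
        --available;                 // this thread is now busy with a task
        auto task = tasks.front();
        tasks.pop();
        lock.unlock();
        try {
            task();
        } catch (const std::exception &) {
            // suppress
        }
        lock.lock();
        ++available;                 // the thread is idle again
    }
}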
You don't seem to be allocating memory for
extern char **environ;
anywhere. Though wouldn't that be a link error?
Cutting this back to be a minimal reproduction case would help a lot. There's a lot of code here that's presumably not necessary to reproduce the problem.
Also, what is this:
// suppress exceptions
If you are getting exceptions while joining your threads, then you presumably haven't joined them all, and cleaning up the threads without joining them will cause an error after main exits.

sporadic segfaults when changing label of gtkmm widget

Hi,
I have a gtkmm application which does some asynchronous network requests to ask the server for additional properties of the GTK widgets.
This means, for example, that the application should be able to change the label of a widget.
In this example I have created a new widget based on Gtk::ToggleButton.
But I found out that the gtkmm application sometimes crashes with a segfault. When debugging with gdb I always end up at the line where I set the label.
For better understanding, I have created an MWE which does the label changes in a loop, to simulate lots of async calls:
#include <boost/asio.hpp>
#include <boost/asio/steady_timer.hpp>
#include <iostream>
#include <thread>
#include <mutex>
#include <gtkmm/application.h>
#include <gtkmm/window.h>
#include <gtkmm/togglebutton.h>
class led_label_t : public Gtk::ToggleButton {
public:
using value_list_t = std::vector<Glib::ustring>;
using lock_t = std::lock_guard<std::mutex>;
led_label_t(Glib::ustring label = "<no data>", bool mnemonic = false)
: Gtk::ToggleButton(std::move(label), std::move(mnemonic)),
_values{"SEL1", "SEL2"} {}
protected:
virtual void on_toggled(void) override {
std::cout << "Clicked Button." << std::endl;
lock_t lock(_mtx);
value_changed(_values[get_active()]);
}
virtual void value_changed(Glib::ustring& value) {
std::string path;
if (get_active()) {
path =
"/usr/share/icons/Adwaita/16x16/emblems/emblem-important.png";
} else {
path = "/usr/share/icons/Adwaita/16x16/emblems/emblem-default.png";
}
remove(); // remove previous label
std::cout << "Changed Label of led_label: "
<< ", value: " << value << std::endl;
add_pixlabel(path, value);
}
private:
mutable std::mutex _mtx;
value_list_t _values;
};
int main(void) {
auto app = Gtk::Application::create();
Gtk::Window window;
window.set_default_size(200, 200);
led_label_t inst{};
inst.show();
window.add(inst);
auto f = [&inst, &window]() {
using namespace std::chrono_literals;
boost::asio::io_service io;
{ //wait for startup
boost::asio::steady_timer t{io, 100ms};
t.wait();
}
bool toggle = true;
for (auto i = 0; i < 2000; i++) {
std::cout << "i=" << i << std::endl;
//wait until next simulated button click
boost::asio::steady_timer t{io, 1ms};
t.wait();
inst.set_active(toggle);
toggle = !toggle;
}
};
std::thread c1(f);
std::thread w([&app, &window]() { app->run(window); });
c1.join();
window.hide();
w.join();
return EXIT_SUCCESS;
}
To compile this example, I use the following command:
g++ main.cpp -o main `pkg-config --cflags --libs gtkmm-3.0` -Wall -pedantic -Wextra -Werror -Wcast-qual -Wcast-align -Wconversion -fdiagnostics-color=auto -g -O0 -std=c++14 -lboost_system -pthread
I am using GCC 4.9.2 and libgtkmm-3.14 (both standard debian jessie)
The segfault I get is the following:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe7fff700 (LWP 7888)]
0x00007ffff6288743 in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
(gdb) bt
#0 0x00007ffff6288743 in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#1 0x00007ffff6288838 in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#2 0x00007ffff6267ce9 in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#3 0x00007ffff627241b in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#4 0x00007ffff63a1601 in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#5 0x00007ffff63a154c in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#6 0x00007ffff63a26b8 in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#7 0x00007ffff644d5ff in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#8 0x00007ffff644d9b7 in gtk_widget_realize ()
from /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#9 0x00007ffff644dbe8 in gtk_widget_map ()
from /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#10 0x00007ffff621c387 in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#11 0x00007ffff626270f in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#12 0x00007ffff46bf474 in ?? ()
from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#13 0x00007ffff46d9087 in g_signal_emit_valist ()
from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#14 0x00007ffff46d99df in g_signal_emit ()
from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#15 0x00007ffff644db99 in gtk_widget_map ()
from /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#16 0x00007ffff64506d8 in gtk_widget_set_parent ()
from /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#17 0x00007ffff6217a9b in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#18 0x00007ffff79a44eb in Gtk::Container_Class::add_callback(_GtkContainer*, _GtkWidget*) () from /usr/lib/x86_64-linux-gnu/libgtkmm-3.0.so.1
#19 0x00007ffff46c253b in g_cclosure_marshal_VOID__OBJECTv ()
from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#20 0x00007ffff46bf474 in ?? ()
from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#21 0x00007ffff46d9087 in g_signal_emit_valist ()
from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#22 0x00007ffff46d99df in g_signal_emit ()
from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#23 0x00007ffff6261aa5 in gtk_container_add ()
from /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#24 0x000000000040b0b5 in led_label_t::value_changed (this=0x7fffffffe2a0,
value=...) at main.cpp:38
#25 0x000000000040afb1 in led_label_t::on_toggled (this=0x7fffffffe2a0)
at main.cpp:24
#26 0x00007ffff7a18af0 in Gtk::ToggleButton_Class::toggled_callback(_GtkToggleButton*) () from /usr/lib/x86_64-linux-gnu/libgtkmm-3.0.so.1
#27 0x00007ffff46bf245 in g_closure_invoke ()
from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#28 0x00007ffff46d083b in ?? ()
from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#29 0x00007ffff46d9778 in g_signal_emit_valist ()
from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#30 0x00007ffff46d99df in g_signal_emit ()
from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#31 0x00007ffff63ecb4d in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#32 0x00007ffff798a4a0 in Gtk::Button_Class::clicked_callback(_GtkButton*) ()
from /usr/lib/x86_64-linux-gnu/libgtkmm-3.0.so.1
#33 0x00007ffff46bf474 in ?? ()
from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#34 0x00007ffff46d9087 in g_signal_emit_valist ()
from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#35 0x00007ffff46d99df in g_signal_emit ()
from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#36 0x00007ffff63ec936 in gtk_toggle_button_set_active ()
from /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#37 0x0000000000405e12 in <lambda()>::operator()(void) const (
__closure=0x74f4f8) at main.cpp:73
#38 0x000000000040811a in std::_Bind_simple<main()::<lambda()>()>::_M_invoke<>(std::_Index_tuple<>) (this=0x74f4f8) at /usr/include/c++/4.9/functional:1700
#39 0x0000000000407fa9 in std::_Bind_simple<main()::<lambda()>()>::operator()(void) (this=0x74f4f8) at /usr/include/c++/4.9/functional:1688
#40 0x0000000000407e9e in std::thread::_Impl<std::_Bind_simple<main()::<lambda()>()> >::_M_run(void) (this=0x74f4e0) at /usr/include/c++/4.9/thread:115
#41 0x00007ffff3f47970 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#42 0x00007ffff37650a4 in start_thread (arg=0x7fffe7fff700)
at pthread_create.c:309
#43 0x00007ffff349a04d in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
The interesting line of this is probably
#24 0x000000000040b0b5 in led_label_t::value_changed (this=0x7fffffffe2a0,
value=...) at main.cpp:38
which is the line where add_pixlabel(path, value); is called.
What am I doing wrong here?
Attention:
This segfault doesn't always occur. On my desktop machine (Intel i7-3xxx) I get the error about once every 10 runs, and on my laptop (Intel i5-3xxx) I get it on nearly every run.
Now I have found a solution, based on the answer of @user4581301. He was right that gtkmm doesn't support multithreading (to be more precise, libsigc++, and in particular sigc::trackable, are not thread-safe).
However, care is required when writing programs based on gtkmm using
multiple threads of execution, arising from the fact that libsigc++,
and in particular sigc::trackable, are not thread-safe.
Quote from gtkmm documentation.
Therefore I have used Glib::Dispatcher to execute the set_label() method in the context of the gtkmm main loop of the window.
Here is the code that no longer segfaults on my machine(s), even with many retries:
#include <boost/asio.hpp>
#include <boost/asio/steady_timer.hpp>
#include <cassert>
#include <iostream>
#include <thread>
#include <mutex>
#include <gtkmm/application.h>
#include <gtkmm/window.h>
#include <gtkmm/togglebutton.h>
#include <glibmm/dispatcher.h>
#define LOG() \
std::cout << (std::chrono::system_clock::now() - start).count() << " " \
<< std::this_thread::get_id() << ": "
auto start = std::chrono::system_clock::now();
class led_label_t : public Gtk::ToggleButton {
public:
using value_list_t = std::vector<Glib::ustring>;
using lock_t = std::lock_guard<std::mutex>;
using action_queue_t = std::vector<Glib::ustring>;
led_label_t(Glib::ustring label = "<no data>", bool mnemonic = false)
: Gtk::ToggleButton(std::move(label), std::move(mnemonic)),
_values{"SEL1", "SEL2"} {}
void set_dispatcher(Glib::Dispatcher* dp) {
_dp = dp;
_dp->connect([this](void) { dispatcher_task(); });
}
protected:
virtual void on_toggled(void) override {
LOG() << "Clicked Button." << std::endl;
{
lock_t lock(_action_mtx);
auto value = _values[get_active()];
_action_queue.push_back({value});
LOG() << "Added label into queue " << value << std::endl;
if (_action_queue.size() > 1) {
return;
}
}
_dp->emit();
}
void dispatcher_task(void) {
Glib::ustring label;
for (;;) {
{
lock_t lock(_action_mtx);
if (_action_queue.size() == 0) {
return;
}
label = *_action_queue.begin();
_action_queue.erase(_action_queue.begin());
}
set_label(label);
LOG() << "Set the label " << label << std::endl;
}
}
private:
mutable std::mutex _action_mtx;
action_queue_t _action_queue;
value_list_t _values;
Glib::Dispatcher* _dp;
};
int main(void) {
auto app = Gtk::Application::create();
Gtk::Window window;
window.set_default_size(200, 200);
led_label_t inst{};
inst.show();
window.add(inst);
auto f = [&inst, &window]() {
using namespace std::chrono_literals;
boost::asio::io_service io;
{ // wait for startup
boost::asio::steady_timer t{io, 100ms};
t.wait();
}
bool toggle = true;
for (auto i = 0; i < 200000; i++) {
// wait until next simulated button click
boost::asio::steady_timer t{io, 250us};
t.wait();
LOG() << "i=" << i << std::endl;
inst.set_active(toggle);
toggle = !toggle;
LOG() << "finished" << std::endl;
}
};
std::thread c1(f);
std::thread w([&app, &window, &inst]() {
Glib::Dispatcher dp;
inst.set_dispatcher(&dp);
app->run(window);
});
c1.join();
window.hide();
w.join();
return EXIT_SUCCESS;
}
Accessing and changing UI components from multiple threads is always tricky. UIs need to be fast and responsive to user input, so they can't hang around waiting for background tasks to complete. As a result, UI components are rarely protected by a mutex or other synchronization. You write, it happens. Except when something else gets in the way.
If you write from two threads... Ooops.
You're halfway through a write when another thread reads... Ooops.
Say, for example, Thread 4 is partway through writing a new string into the label when a screen refresh is triggered. If the backend for the label is a C-style string, the terminating null may have been overwritten, and the read during the refresh runs off the end into bad RAM.
All sorts of things could go wrong, and some will be survivable or, worse, look like it. You're better off having all of the UI management in one thread and having the other threads queue updates to the UI thread. Start by looking into Model-View-Controller and then try related patterns if needed.

GCC's TSAN reports a data race with a thread safe static local

I wrote the following toy example:
#include <iostream>
#include <map>
#include <string>
#include <thread>
std::map<char, size_t> getMap(const std::string& s)
{
std::map<char, size_t> map;
size_t i = 0;
for (const char * b = s.data(), *end = b + s.size(); b != end; ++b)
{
map[*b] = i++;
}
return map;
}
void check(const std::string& s)
{
//The creation of the map should be thread safe according to the C++11 rules.
static const auto map = getMap("12abcd12ef");
//Now we can read the map concurrently.
size_t n = 0;
for (const char* b = s.data(), *end = b + s.size(); b != end; ++b)
{
auto iter = map.find(*b);
if (iter != map.end())
{
n += iter->second;
}
}
std::cout << "check(" << s << ")=" << n << std::endl;
}
int main()
{
std::thread t1(check, "abc");
std::thread t2(check, "def");
t1.join();
t2.join();
return 0;
}
According to the C++11 standard, this should not contain any data race (cf. this post).
However, TSan with gcc 4.9.2 reports a data race:
==================
WARNING: ThreadSanitizer: data race (pid=14054)
Read of size 8 at 0x7f409f5a3690 by thread T2:
#0 TestServer::check(std::string const&) <null>:0 (TestServer+0x0000000cc30a)
#1 std::thread::_Impl<std::_Bind_simple<void (*(char const*))(std::string const&)> >::_M_run() <null>:0 (TestServer+0x0000000cce37)
#2 execute_native_thread_routine ../../../../../gcc-4.9.2/libstdc++-v3/src/c++11/thread.cc:84 (libstdc++.so.6+0x0000000b5bdf)
Previous write of size 8 at 0x7f409f5a3690 by thread T1:
#0 TestServer::getMap(std::string const&) <null>:0 (TestServer+0x0000000cc032)
#1 TestServer::check(std::string const&) <null>:0 (TestServer+0x0000000cc5dd)
#2 std::thread::_Impl<std::_Bind_simple<void (*(char const*))(std::string const&)> >::_M_run() <null>:0 (TestServer+0x0000000cce37)
#3 execute_native_thread_routine ../../../../../gcc-4.9.2/libstdc++-v3/src/c++11/thread.cc:84 (libstdc++.so.6+0x0000000b5bdf)
Location is global 'TestServer::check(std::string const&)::map' of size 48 at 0x7f409f5a3680 (TestServer+0x00000062b690)
Thread T2 (tid=14075, running) created by main thread at:
#0 pthread_create ../../../../gcc-4.9.2/libsanitizer/tsan/tsan_interceptors.cc:877 (libtsan.so.0+0x000000047c03)
#1 __gthread_create /home/Guillaume/Compile/objdir/x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu/bits/gthr-default.h:662 (libstdc++.so.6+0x0000000b5d00)
#2 std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>) ../../../../../gcc-4.9.2/libstdc++-v3/src/c++11/thread.cc:142 (libstdc++.so.6+0x0000000b5d00)
#3 TestServer::main() <null>:0 (TestServer+0x0000000ae914)
#4 StarQube::runSuite(char const*, void (*)()) <null>:0 (TestServer+0x0000000ce328)
#5 main <null>:0 (TestServer+0x0000000ae8bd)
Thread T1 (tid=14074, finished) created by main thread at:
#0 pthread_create ../../../../gcc-4.9.2/libsanitizer/tsan/tsan_interceptors.cc:877 (libtsan.so.0+0x000000047c03)
#1 __gthread_create /home/Guillaume/Compile/objdir/x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu/bits/gthr-default.h:662 (libstdc++.so.6+0x0000000b5d00)
#2 std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>) ../../../../../gcc-4.9.2/libstdc++-v3/src/c++11/thread.cc:142 (libstdc++.so.6+0x0000000b5d00)
#3 TestServer::main() <null>:0 (TestServer+0x0000000ae902)
#4 StarQube::runSuite(char const*, void (*)()) <null>:0 (TestServer+0x0000000ce328)
#5 main <null>:0 (TestServer+0x0000000ae8bd)
SUMMARY: ThreadSanitizer: data race ??:0 TestServer::check(std::string const&)
==================
What is wrong here?
Is TSan buggy? (When I use Clang's toolchain, I get no data race report.)
Does GCC emit code which is not thread-safe? (I am not using -fno-threadsafe-statics, though.)
Is my understanding of static locals incorrect?
I believe this is a bug in the part of GCC that generates the instrumentation code for TSan.
I tried this:
#include <thread>
#include <iostream>
#include <string>
std::string message()
{
static std::string msg("hi");
return msg;
}
int main()
{
std::thread t1([]() { std::cout << message() << "\n"; });
std::thread t2([]() { std::cout << message() << "\n"; });
t1.join();
t2.join();
}
If you look at the code generated by clang and gcc, both look good: __cxa_guard_acquire is called in both cases on the path that initializes the static local variable. But the check of whether msg still needs to be initialized is where the problem is.
The generated code looks roughly like this:
if (!atomic_guard_flag /* uint8_t */) {
    lock();
    call_constructor_of_msg();
    unlock();
}
For that guard check, clang generates a callq __tsan_atomic8_load, but gcc generates a callq __tsan_read1 instead.
Note that these calls only annotate the real memory operations for the TSan runtime; they do not perform the operations themselves.
So at runtime the TSan library thinks the guard is read with a plain, non-atomic load, concludes that all is bad, and we get a data race report. I reported the problem here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68338
and it looks like it is fixed in trunk, but not in the current stable release of GCC (5.2).
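Until a fixed GCC is available, one possible workaround (my own sketch, not from the bug report) is to force the one-time initialization of the static local while the program is still single-threaded, so there is no concurrent first call for TSan to misreport:
int main()
{
    // Warm up the function-local static before any other threads exist;
    // later calls from t1/t2 only read the already-constructed map.
    check("");
    std::thread t1(check, "abc");
    std::thread t2(check, "def");
    t1.join();
    t2.join();
    return 0;
}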

high cpu usage in boost::asio::io_service::run

I have encountered a strange problem with boost::asio::io_service::run. Sometimes this run function seems to eat a whole CPU (100%), and sometimes not. I am not clear about the pattern.
The relevant code:
class Asio {
public:
Asio() :
io_service_(new boost::asio::io_service),
run_() {}
void Start() {
if (!asio_thread_.joinable()) {
run_ = true;
asio_thread_ = std::thread([=] {
Run();
});
}
}
boost::asio::io_service* io_service() { return io_service_.get(); }
protected:
void Run() {
for (;;) {
boost::system::error_code ec;
io_service()->run(ec);
if (run_) {
io_service()->reset();
std::this_thread::sleep_for(std::chrono::milliseconds(100));
} else {
break;
}
}
}
protected:
std::unique_ptr<boost::asio::io_service> io_service_;
std::thread asio_thread_;
std::atomic<bool> run_;
};
When the run function runs normally, this is the call stack:
#0 0x00000035f74e9163 in epoll_wait () from /lib64/libc.so.6
#1 0x0000000000b3f6ef in boost::asio::detail::epoll_reactor::run(bool, boost::asio::detail::op_queue<boost::asio::detail::task_io_service_operation>&) ()
#2 0x0000000000b40111 in boost::asio::detail::task_io_service::do_run_one(boost::asio::detail::scoped_lock<boost::asio::detail::posix_mutex>&, boost::asio::detail::task_io_service_thread_info&, boost::system::error_code const&) ()
#3 0x0000000000b3feaf in boost::asio::detail::task_io_service::run(boost::system::error_code&) ()
#4 0x0000000000b403fd in boost::asio::io_service::run(boost::system::error_code&) ()
#5 0x0000000000b3ddc1 in Asio::Run() ()
When the run function behaves abnormally, this is the call stack:
#0 0x00000031bbee53c9 in syscall () from /lib64/libc.so.6
#1 0x00007f831d1d3d68 in std::chrono::_V2::steady_clock::now() () from /usr/local/gcc48/lib64/libstdc++.so.6
#2 0x0000000000b45b6d in boost::asio::detail::chrono_time_traits<std::chrono::_V2::steady_clock, boost::asio::wait_traits<std::chrono::_V2::steady_clock> >::now() ()
#3 0x0000000000b45608 in boost::asio::detail::timer_queue<boost::asio::detail::chrono_time_traits<std::chrono::_V2::steady_clock, boost::asio::wait_traits<std::chrono::_V2::steady_clock> > >::get_ready_timers(boost::asio::detail::op_queue<boost::asio::detail::task_io_service_operation>&) ()
#4 0x0000000000b3f5d7 in boost::asio::detail::timer_queue_set::get_ready_timers(boost::asio::detail::op_queue<boost::asio::detail::task_io_service_operation>&) ()
#5 0x0000000000b3f815 in boost::asio::detail::epoll_reactor::run(bool, boost::asio::detail::op_queue<boost::asio::detail::task_io_service_operation>&) ()
#6 0x0000000000b40111 in boost::asio::detail::task_io_service::do_run_one(boost::asio::detail::scoped_lock<boost::asio::detail::posix_mutex>&, boost::asio::detail::task_io_service_thread_info&, boost::system::error_code const&) ()
#7 0x0000000000b3feaf in boost::asio::detail::task_io_service::run(boost::system::error_code&) ()
#8 0x0000000000b403fd in boost::asio::io_service::run(boost::system::error_code&) ()
#9 0x0000000000b3ddc1 in Asio::Run() ()
In both cases there are some pending handlers in the io_service, so io_service::run should not return and should be waiting for an event to happen.
Any advice is welcome.
I did a further check; it seems to be due to the boost::asio::steady_timer that is used. The usage of the steady_timer follows this pattern:
boost::asio::steady_timer timer;
timer.expires_at(some_expiry, error_code);
timer.async_wait([=](boost::system::error_code ec) {
// some operation
timer.expires_at(new_expiry, error_code);
timer.async_wait(...);
});
Here the timer is wrapped in a shared pointer, so it is safe to copy it into the lambda function.
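For reference, here is a minimal, self-contained sketch of that rescheduling pattern as I understand it (the start_tick helper, the 1-second period and the tick count are my own inventions, not from the real code):
#include <boost/asio.hpp>
#include <boost/asio/steady_timer.hpp>
#include <chrono>
#include <iostream>
#include <memory>
// Re-arm the timer from inside its own completion handler; the shared_ptr
// keeps the timer alive for as long as a handler holding a copy is pending.
void start_tick(std::shared_ptr<boost::asio::steady_timer> timer, int remaining)
{
    if (remaining <= 0)
        return;
    timer->expires_from_now(std::chrono::seconds(1));
    timer->async_wait([timer, remaining](const boost::system::error_code& ec) {
        if (ec)
            return;                     // timer was cancelled or failed
        std::cout << "tick, " << remaining - 1 << " left" << std::endl;
        start_tick(timer, remaining - 1);
    });
}
int main()
{
    boost::asio::io_service io;
    auto timer = std::make_shared<boost::asio::steady_timer>(io);
    start_tick(timer, 3);
    io.run();                           // returns once no handlers remain
}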

Crash related to boost::function usage in thread pool

I am trying to implement a thread pool in C++ using pthreads. I want to encapsulate the logic related to thread management in one object which takes ownership of these threads. That means whenever this object is destroyed, the threads must be stopped and cleaned up.
I've been testing my code and it turns out that I get a segmentation fault when I destroy the WorkerThreadManager object while a boost::function is being called. See the code and the backtrace from GDB below. I don't really understand why it happens; as far as I know boost::function is copyable, so once I get a copy of it from the queue, I can pop() it and even destroy the whole queue (I proved that in a small test) and then call the function's copy.
WorkerThreadManager.h:
#include "WorkerThreadManagerInterface.h"
#include "utils/mutex.h"
#include <queue>
#include <semaphore.h>
#include <iostream>
class WorkerThreadManager : public WorkerThreadManagerInterface
{
public:
WorkerThreadManager(unsigned threadsNumber = 5);
virtual ~WorkerThreadManager();
virtual void PushTask(thread_function_t A_threadFun, result_function_t A_resultFun);
void SignalResults();
private:
static void* WorkerThread(void* A_data);
void PushResult(int A_result, result_function_t A_resultFun);
typedef boost::function<void ()> signal_function_t;
struct worker_thread_data_t
{
worker_thread_data_t(thread_function_t A_threadFun, result_function_t A_resultFun) :
threadFun(A_threadFun), resultFun(A_resultFun) {}
worker_thread_data_t() {}
thread_function_t threadFun;
result_function_t resultFun;
};
const unsigned m_threadsNumber;
pthread_t* m_pthreads;
utils::Mutex m_tasksMutex;
sem_t m_tasksSem;
std::queue<worker_thread_data_t> m_tasks;
utils::Mutex m_resultsMutex;
std::queue<signal_function_t> m_results;
};
WorkerThreadManager.cpp:
#include "WorkerThreadManager.h"
#include "gateway_log.h"
#include <pthread.h>
/**
* @brief Creates semaphore and starts threads.
*/
WorkerThreadManager::WorkerThreadManager(unsigned threadsNumber) : m_threadsNumber(threadsNumber)
{
if ( sem_init(&m_tasksSem, 0, 0) )
{
std::stringstream ss;
ss << "Semaphore could not be initialized: " << errno << " - " << strerror(errno);
LOG_FATAL(ss);
throw std::runtime_error(ss.str());
}
m_pthreads = new pthread_t[m_threadsNumber];
for (unsigned i = 0; i < m_threadsNumber; ++i)
{
int rc = pthread_create(&m_pthreads[i], NULL, WorkerThreadManager::WorkerThread, (void*) this );
if(rc)
{
std::stringstream ss;
ss << "Pthread could not be started: " << errno << " - " << strerror(errno);
LOG_FATAL(ss.str());
if ( sem_destroy(&m_tasksSem) )
LOG_ERROR("Semaphore could not be destroyed: " << errno << " - " << strerror(errno));
delete [] m_pthreads;
throw std::runtime_error(ss.str());
}
else
{
LOG_DEBUG("Worker thread started " << m_pthreads[i]);
if(pthread_detach(m_pthreads[i]))
LOG_WARN("Failed to detach worker thread");
}
}
}
/**
* @brief Cancels all threads, destroys semaphore
*/
WorkerThreadManager::~WorkerThreadManager()
{
LOG_DEBUG("~WorkerThreadManager()");
for(unsigned i = 0; i < m_threadsNumber; ++i)
{
if ( pthread_cancel(m_pthreads[i]) )
LOG_ERROR("Worker thread cancellation failed");
}
if ( sem_destroy(&m_tasksSem) )
LOG_ERROR("Semaphore could not be destroyed: " << errno << " - " << strerror(errno));
delete [] m_pthreads;
}
/**
* @brief Adds a new task to the queue, so worker threads can process it.
* @param A_threadFun function which will be executed by a worker thread
* @param A_resultFun function which will be enqueued for calling with the return value of A_threadFun as parameter
* after a worker thread executes A_threadFun.
*/
void WorkerThreadManager::PushTask(thread_function_t A_threadFun, result_function_t A_resultFun)
{
utils::ScopedLock mutex(m_tasksMutex);
worker_thread_data_t data(A_threadFun, A_resultFun);
m_tasks.push( data );
sem_post(&m_tasksSem);
LOG_DEBUG("Task for worker threads has been added to queue");
}
/**
* @brief Executes result functions (if there are any) to give feedback
* to classes which requested task execution in worker thread.
*/
void WorkerThreadManager::SignalResults()
{
while(true)
{
signal_function_t signal;
{
utils::ScopedLock mutex(m_resultsMutex);
if(m_results.size())
{
signal = m_results.front();
m_results.pop();
}
else
return;
}
signal();
}
}
/**
* @brief Enqueues result of function executed in worker thread.
* @param A_result return value of function executed in worker thread
* @param A_resultFun function which will be enqueued for calling with A_result as a parameter.
*/
void WorkerThreadManager::PushResult(int A_result, result_function_t A_resultFun)
{
utils::ScopedLock mutex(m_resultsMutex);
signal_function_t signal = boost::bind(A_resultFun, A_result);
m_results.push( signal );
}
/**
* @brief worker thread body
* @param A_data pointer to WorkerThreadManager instance
*/
void* WorkerThreadManager::WorkerThread(void* A_data)
{
WorkerThreadManager* manager = reinterpret_cast<WorkerThreadManager*>(A_data);
LOG_DEBUG("Starting worker thread loop");
while (1)
{
if ( -1 == sem_wait(&manager->m_tasksSem) && errno == EINTR )
{
LOG_DEBUG("sem_wait interrupted with signal");
continue;
}
LOG_DEBUG("WorkerThread:::::: about to call lock mutex");
worker_thread_data_t data;
{
utils::ScopedLock mutex(manager->m_tasksMutex);
data = manager->m_tasks.front();
manager->m_results.pop();
}
LOG_DEBUG("WorkerThread:::::: about to call resultFun");
int result = data.threadFun();
LOG_DEBUG("WorkerThread:::::: after call resultFun");
pthread_testcancel();
manager->PushResult(result, data.resultFun);
}
return NULL;
}
main.cpp:
#include "gateway_log.h"
#include "WorkerThreadManager.h"
#include <memory>
class A {
public:
int Fun() { LOG_DEBUG("Fun before sleep"); sleep(8); LOG_DEBUG("Fun after sleep");return 0; }
void Result(int a) { LOG_DEBUG("Result: " << a); }
};
int main()
{
std::auto_ptr<WorkerThreadManager> workerThreadManager(new WorkerThreadManager);
A a;
workerThreadManager->PushTask(boost::bind(&A::Fun, &a), boost::bind(&A::Result, &a, _1));
sleep(3);
LOG_DEBUG("deleting workerThreadManager");
workerThreadManager.reset(); // <<<--- CRASH
LOG_DEBUG("deleted workerThreadManager");
sleep(10);
LOG_DEBUG("after sleep");
return 0;
}
GDB:
(gdb) bt
#0 0xb7ad33a0 in ?? () from /lib/i386-linux-gnu/libc.so.6
#1 0x0807d3a7 in boost::function0<void>::clear (this=0x858db48) at /home/marcin/intel_build/boost_1_42_0/boost/function/function_template.hpp:856
#2 0x0807d17b in boost::function0<void>::~function0 (this=0x858db48, __in_chrg=<optimized out>) at /home/marcin/intel_build/boost_1_42_0/boost/function/function_template.hpp:752
#3 0x0807cec5 in boost::function<void()>::~function(void) (this=0x858db48, __in_chrg=<optimized out>) at /home/marcin/intel_build/boost_1_42_0/boost/function/function_template.hpp:1043
#4 0x0807ced8 in std::_Destroy<boost::function<void ()> >(boost::function<void ()>*) (__pointer=0x858db48) at /usr/include/c++/4.6/bits/stl_construct.h:94
#5 0x0807c868 in std::_Destroy_aux<false>::__destroy<boost::function<void ()>*>(boost::function<void ()>*, boost::function<void ()>*) (__first=0x858db48, __last=0x858d928) at /usr/include/c++/4.6/bits/stl_construct.h:104
#6 0x0807bd05 in std::_Destroy<boost::function<void ()>*>(boost::function<void ()>*, boost::function<void ()>*) (__first=0x858d938, __last=0x858d928) at /usr/include/c++/4.6/bits/stl_construct.h:127
#7 0x0807af23 in std::_Destroy<boost::function<void ()>*, boost::function<void ()> >(boost::function<void ()>*, boost::function<void ()>*, std::allocator<boost::function<void ()> >&) (__first=0x858d938, __last=0x858d928)
at /usr/include/c++/4.6/bits/stl_construct.h:153
#8 0x0807a037 in std::deque<boost::function<void ()>, std::allocator<boost::function<void ()> > >::_M_destroy_data_aux(std::_Deque_iterator<boost::function<void ()>, boost::function<void ()>&, boost::function<void ()>*>, std::_Deque_iterator<boost::function<void ()>, boost::function<void ()>&, boost::function<void ()>*>) (this=0x858beec, __first=..., __last=...) at /usr/include/c++/4.6/bits/deque.tcc:795
#9 0x08076153 in std::deque<boost::function<void ()>, std::allocator<boost::function<void ()> > >::_M_destroy_data(std::_Deque_iterator<boost::function<void ()>, boost::function<void ()>&, boost::function<void ()>*>, std::_Deque_iterator<boost::function<void ()>, boost::function<void ()>&, boost::function<void ()>*>, std::allocator<boost::function<void ()> > const&) (this=0x858beec, __first=..., __last=...) at /usr/include/c++/4.6/bits/stl_deque.h:1816
#10 0x08073411 in std::deque<boost::function<void()>, std::allocator<boost::function<void()> > >::~deque(void) (this=0x858beec, __in_chrg=<optimized out>) at /usr/include/c++/4.6/bits/stl_deque.h:898
#11 0x0806a355 in std::queue<boost::function<void()>, std::deque<boost::function<void()>, std::allocator<boost::function<void()> > > >::~queue(void) (this=0x858beec, __in_chrg=<optimized out>)
at /usr/include/c++/4.6/bits/stl_queue.h:92
#12 0x0815a054 in WorkerThreadManager::~WorkerThreadManager (this=0x858be98, __in_chrg=<optimized out>) at WorkerThreadManager.cpp:42
#13 0x0815a1e3 in WorkerThreadManager::~WorkerThreadManager (this=0x858be98, __in_chrg=<optimized out>) at WorkerThreadManager.cpp:56
#14 0x080c6c51 in std::auto_ptr<WorkerThreadManager>::reset (this=0x85463e4, __p=0x0) at /usr/include/c++/4.6/backward/auto_ptr.h:244
#15 0x080604a9 in main ()
I would really appreciate any help.
There is no guarantee that pthread_cancel waits for the cancellation of the target to complete before it returns. When successful, it simply requests cancellation, but does not wait for it to complete. You need to use pthread_join to wait for the threads to have completed (note that your constructor currently detaches the threads, so they cannot be joined as the code stands).
I suspect that as the destructor is proceeding in one thread, one of the threads wakes up (due to the sem_destroy) and erroneously attempts to read/pop the queue. I'm not sure why it's causing a crash in the main thread, but I would eliminate this potential issue first.
Finally, I would highly recommend you move some of these semaphore and thread mechanisms into their own classes, to make the code more exception-safe; a rough sketch of what such a wrapper could look like is below.
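For example, a minimal RAII wrapper around the POSIX semaphore (my own sketch with invented names, kept C++03-style like the rest of your code) guarantees that sem_destroy runs on every exit path, including when the constructor of WorkerThreadManager throws later on:
#include <semaphore.h>
#include <stdexcept>
// The semaphore is created in the constructor and destroyed in the
// destructor, so cleanup no longer depends on reaching a particular
// line in ~WorkerThreadManager.
class Semaphore
{
public:
    explicit Semaphore(unsigned initial = 0)
    {
        if (sem_init(&m_sem, 0, initial))
            throw std::runtime_error("Semaphore could not be initialized");
    }
    ~Semaphore() { sem_destroy(&m_sem); }
    void Post() { sem_post(&m_sem); }
    int Wait() { return sem_wait(&m_sem); }  // returns -1 with errno == EINTR on signal, like sem_wait
private:
    Semaphore(const Semaphore&);             // non-copyable
    Semaphore& operator=(const Semaphore&);
    sem_t m_sem;
};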
I found the bug; it was trivial - shame on me :(
In void* WorkerThreadManager::WorkerThread(void* A_data) I popped the m_results queue instead of m_tasks as I had intended:
worker_thread_data_t data;
{
utils::ScopedLock mutex(manager->m_tasksMutex);
data = manager->m_tasks.front();
manager->m_results.pop();
}
Anyway, I don't really understand why it caused the crash so late, in the destructor of the queue.
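For completeness, the corrected critical section presumably looks like this; popping m_results while it is empty is undefined behaviour that corrupts the deque's internal bookkeeping, which is why the crash only shows up later, when the damaged deque is destroyed:
worker_thread_data_t data;
{
    utils::ScopedLock mutex(manager->m_tasksMutex);
    data = manager->m_tasks.front();
    manager->m_tasks.pop();   // pop the task queue, not m_results
}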