Pthread_create seems to flush the argument passed to the input function - c++

I am working on a pthread code to do repeated matrix vector product. While doing so, I first wrote the serial matrix vector code for multiplication and then later I attempted to put the matrix vector product into separate threads.
The code https://github.com/viswans/parallel-computing-cs525/blob/pthread/pthread_page_rank/src/pthread/pagerankPthread.cpp does what I just described. Particularly when the number of threads is increased from 8 to 9, the binary results in a segmentation fault.
On debugging using gdb I noticed that there was a null pointer being dereferenced, and I added a watch point on that pointer to see if it is being set properly. What I noticed was that the argument to the function being called from pthread_create seems to be flushed and set to 0!
Old value = 37843
New value = 45242576
0x0000000000403436 in __gnu_cxx::new_allocator<(anonymous namespace)::ThreadStruct>::construct<(anonymous namespace)::ThreadStruct, (anonymous namespace)::ThreadStruct> (this=0x2b25970, __p=0x2b260e0) at /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.4/include/g++-v4/ext/new_allocator.h:120
120 { ::new((void *)__p) _Up(std::forward<_Args>(__args)...); }
(gdb) c
Continuing.
[New Thread 0x7ffff2985700 (LWP 3390)]
[New Thread 0x7ffff2184700 (LWP 3391)]
[New Thread 0x7ffff1983700 (LWP 3392)]
[New Thread 0x7ffff1182700 (LWP 3393)]
Hardware watchpoint 3: *(0x2b260e8)
Old value = 45242576
New value = 0
0x00007ffff708eedb in __memset_sse2 () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff708eedb in __memset_sse2 () from /lib64/libc.so.6
#1 0x00007ffff7ded2e2 in allocate_dtv () from /lib64/ld-linux-x86-64.so.2
#2 0x00007ffff7ded9be in _dl_allocate_tls () from /lib64/ld-linux-x86-64.so.2
#3 0x00007ffff7bc9fc5 in pthread_create##GLIBC_2.2.5 () from /lib64/libpthread.so.0
#4 0x0000000000402b47 in PageRank::PageRankPthread::calculatePageRank (matrix=std::shared_ptr (count 1, weak 0) 0x2b258d0,�
input=std::vector of length 196591, capacity 196591 = {...}, num_threads=9, criterion=...) at src/pthread/pagerankPthread.cpp:84
#5 0x0000000000401d5d in mainPthread (argc=3, argv=0x7fffffffe6b8) at src/pthread/mainPthread.cpp:31
#6 0x000000000040be47 in main (argc=3, argv=0x7fffffffe6b8) at src/main.cpp:9
Any insight about why pthread_create would flush the arguments would be much appreciated.
Thanks
Sudharshan

You call push_back on the tstruct vector, which invalidates all pointers into that vector, causing the threads to access structures that have moved. One simple fix is to add tstruct.reserve(num_threads); after std::vector< ThreadStruct > tstruct;.
But you should really rethink this and do things in a more sensible way. Is a vector of structures a suitable collection to use when you need a pointer into the collection to remain valid as the collection is modified?

Related

How to do thread safe shared_ptr modification and access?

Goal: I want to modify internal information and access this information from many threads synchronously as fast as possible
I simplified code bellow, but this is how I tried to achieve this.
I have 2 shared pointers.
One is called m_mutable_data and the other is called m_const_data.
m_mutable_data is updated in strand guarded way. m_const_data is updated with contents of m_mutable_data every 60s also in the strand guarded way.
This is the only place m_const_data shared pointer is reset with new data. m_const_data is read synchronously by many threads, 1000+ times per second.
Code
class black_list_container : public std::enable_shared_from_this<black_list_container>
{
struct meta_data
{
bool blacked;
}
struct black_list_data
{
std::unordered_map<uint32_t,meta_data> data;
}
public:
#pragma optimize( "", off )
bool is_blacked(uint32_t id)
{
// This call is called from many different threads (1000+ calls per second)
// should be synchronous and as fast as possible
auto c = m_const_data;
return c->data[id].blacked;
}
#pragma optimize( "", on )
#pragma optimize( "", off )
void update_const_data()
{
// Called internaly by timer every 60s to update m_const_data with contents of m_mutable_data
// Guarded with strand
m_strand->post([self{shared_from_this()}]{
auto snapshot = new black_list_data();
snapshot->data = m_mutable_data->data;
m_const_data.reset(snapshot);
});
}
#pragma optimize( "", on )
private:
void internal_modification_mutable_data()
{
// Called internaly by different metrics
// Guarded with strand
m_strand->post([self{shared_from_this()}]{
// .... do some modification on internal m_mutable_data
});
}
boost::asio::io_context::strand m_strand;
std::shared_ptr<black_list_data> m_mutable_data;
std::shared_ptr<black_list_data> m_const_data;
};
Very, very seldom this code crashes in method 'is_blacked' on line
auto c = m_const_data;
This is the backtrace
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./STRATUM-01'.
Program terminated with signal 6, Aborted.
#0 0x00007fe09aaf1387 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-307.el7.1.x86_64 libgcc-4.8.5-39.el7.x86_64 libstdc++-4.8.5-39.el7.x86_64
(gdb) bt
#0 0x00007fe09aaf1387 in raise () from /lib64/libc.so.6
#1 0x00007fe09aaf2a78 in abort () from /lib64/libc.so.6
#2 0x00007fe09ab33ed7 in __libc_message () from /lib64/libc.so.6
#3 0x00007fe09ab3c299 in _int_free () from /lib64/libc.so.6
#4 0x00000000005fae36 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7fe0440aeaa0) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/shared_ptr_base.h:154
#5 0x00000000006b9205 in ~__shared_count (this=<synthetic pointer>, __in_chrg=<optimized out>) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/shared_ptr_base.h:684
#6 ~__shared_ptr (this=<synthetic pointer>, __in_chrg=<optimized out>) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/shared_ptr_base.h:1123
#7 ~shared_ptr (this=<synthetic pointer>, __in_chrg=<optimized out>) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/shared_ptr.h:93
#8 black_list_container_impl::is_blacked (this=0x7fe08c287e50, id=23654) at /var/lib/jenkins/workspace/validator/src/black_list_container.cpp:69
I'm not exactly sure why destruction of shared_ptr is called in frame #7
Obviously I did not achieve my goal so please direct me into pattern that actually achieves my goal in thread safe way.
I know I could have used
std::atomic<std::shared_ptr<black_list_data>> m_const_data;
but would not this affect performance while reading from many different threads?
I think I found the answer to my question in this article Atomic Smart Pointers.
So I have to change code in update_const_data() to
auto snapshot = std::make_shared<black_list_data>();
snapshot->data = m_mutable_data->data;
std::atomic_store(&m_const_data, snapshot);
and code in is_blacked() to
auto c = std::atomic_load(&m_const_data);

Why 'cout' statement printed twice (even it is synchrinized) from a particular thread if pthread_join() is not used?

#include < iostream >
#include < pthread.h >
using namespace std;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
void* Func(void *)
{
pthread_mutex_lock(&mutex);
cout << "First thread execution" << endl;
pthread_mutex_unlock(&mutex);
}
int main()
{
pthread_t th1;
pthread_create(&th1, NULL, Func, NULL);
pthread_mutex_lock(&mutex);
cout << "In main thread" << endl;
pthread_mutex_lock(&mutex);
// pthread_join(th1, NULL); // Note this code is commented
return 0;
}
I have executed following program on linux fedora 22 (also on http://www.cpp.sh/) around 20 times, and out of 20 execution I have found following outputs:-
Output1:
In main thread
First thread execution
Output2:
First thread execution
In main thread
Output3:
In main thread
Output4:
In main thread
First thread execution
First thread execution
Output 1 to 3 are expected as main thread is not waiting child thread to exit. Execution sequence of both the threads (main and child) is fully dependent on Kernel thread scheduling.
But output 4 is strange !!! First thread execution gets printed two times !!!
Now if I run program after un-commentting code 'pthread_join(th1, NULL)' or add 'pthread_exit(NULL)', I do not get strange output (i.e. First thread execution never printed twice) ever, even I run code 10000 times.
My questions to experts are:
Without pthread_join/pthread_exit what is happening behind the scene so that First thread execution have got printed 2 times?
Responsibility of pthread_join is to get exit code of a particular thread, and after successful call of pthread_join, kernel will free resources of that particular thread. If I do not call pthread_join on a joinable thread then it will result in resource leak, but why above mentioned strange behavior ??
We might say, this is un-defined behavior, but it would be great if any expert provide technical explanation on this.
How pthread_join/pthread_exit can prevent above mentioned strange behavior ? What hidden thing it is doing here due to that strange behavior doesn't appear ?
Thanks to experts in advance..
I've observed this kind of double printing in a similar situation. While your thread was waiting in the write system call doing its normal output, specifically, in this stack:
#0 0x00007ffff78f4640 in write () from /lib64/libc.so.6
#1 0x00007ffff788fb93 in _IO_file_write () from /lib64/libc.so.6
#2 0x00007ffff788fa72 in new_do_write () from /lib64/libc.so.6
#3 0x00007ffff7890e05 in _IO_do_write () from /lib64/libc.so.6
#4 0x00007ffff789114f in _IO_file_overflow () from /lib64/libc.so.6
the program was terminated normally, normal termination caused the output subsystem to flush all buffers. The output buffer on stdin was not yet marked free (the write system call didn't return yet), so it was written out again:
#0 0x00007ffff78f4640 in write () from /lib64/libc.so.6
#1 0x00007ffff788fb93 in _IO_file_write () from /lib64/libc.so.6
#2 0x00007ffff788fa72 in new_do_write () from /lib64/libc.so.6
#3 0x00007ffff7890e05 in _IO_do_write () from /lib64/libc.so.6
#4 0x00007ffff7890140 in _IO_file_sync () from /lib64/libc.so.6
#5 0x00007ffff7891f56 in _IO_default_setbuf () from /lib64/libc.so.6
#6 0x00007ffff7890179 in _IO_file_setbuf () from /lib64/libc.so.6
#7 0x00007ffff7892703 in _IO_cleanup () from /lib64/libc.so.6
#8 0x00007ffff78512f8 in __run_exit_handlers () from /lib64/libc.so.
In any case, join your threads (If you used C++ threads, it would have reminded you to do that) or otherwise synchronize access to the output stream.
The main thread might end earlier then the spawned thread.
Ending the main thread implies ending the whole process, along with all threads being brought down abruptly. This might invoke undefined behaviour, thus anything can happen.
To get around this
either join the spawned thread using pthread_join() from main(),
or end the main thread using pthread_exit(), which just ends the main thread and keeps the process from being ended.

QSerialPort::readAll() leads to SIGSEGV / SIGABRT if called in a while-loop

I'm communicating with a hardware device using QSerialPort. New data does not emit the "readyRead"-Signal, so I decided to write a read thread using QThread.
This is the code:
void ReadThread::run()
{
while(true){
readData();
if (buffer.size() > 0) parseData();
}
}
and
void ReadThread::readData()
{
buffer.append(device->readAll();
}
with buffer being an private QByteArray and device being a pointer to the QSerialPort. ParseData will parse the data and emit some signals. Buffer is cleared when parseData is left.
This works, however after some time (sometimes 10 seconds, sometimes 1 hour) the program crashes with SIGSEGV with the following trace:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff3498700 (LWP 24870)]
malloc_consolidate (av=av#entry=0x7fffec000020) at malloc.c:4151
(gdb) bt
#0 malloc_consolidate (av=av#entry=0x7fffec000020) at malloc.c:4151
#1 0x00007ffff62c2ee8 in _int_malloc (av=av#entry=0x7fffec000020, bytes=bytes#entry=32769) at malloc.c:3423
#2 0x00007ffff62c4661 in _int_realloc (av=av#entry=0x7fffec000020, oldp=oldp#entry=0x7fffec0013b0, oldsize=oldsize#entry=64, nb=nb#entry=32784) at malloc.c:4286
#3 0x00007ffff62c57b9 in __GI___libc_realloc (oldmem=0x7fffec0013c0, bytes=32768) at malloc.c:3029
#4 0x00007ffff70d1cdd in QByteArray::reallocData(unsigned int, QFlags<QArrayData::AllocationOption>) () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#5 0x00007ffff70d1f07 in QByteArray::resize(int) () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#6 0x00007ffff799f9fc in free (bytes=<optimized out>, this=0x609458)
at ../../include/QtSerialPort/5.3.2/QtSerialPort/private/../../../../../src/serialport/qt4support/include/private/qringbuffer_p.h:140
#7 read (maxLength=<optimized out>, data=<optimized out>, this=0x609458)
at ../../include/QtSerialPort/5.3.2/QtSerialPort/private/../../../../../src/serialport/qt4support/include/private/qringbuffer_p.h:326
#8 QSerialPort::readData (this=<optimized out>, data=<optimized out>, maxSize=<optimized out>) at qserialport.cpp:1341
#9 0x00007ffff722bdf0 in QIODevice::read(char*, long long) () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#10 0x00007ffff722cbaf in QIODevice::readAll() () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#11 0x00007ffff7bd0741 in readThread::readData (this=0x6066c0) at ../reader.cpp:212
#12 0x00007ffff7bc80d0 in readThread::run (this=0x6066c0) at ../reader.cpp:16
#13 0x00007ffff70cdd2e in ?? () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#14 0x00007ffff6e1c0a4 in start_thread (arg=0x7ffff3498700) at pthread_create.c:309
#15 0x00007ffff632f04d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
I'm not sure how to reproduce the problem correctly, since it appears randomly. If I comment out the "readData()" in my while loop, the crashes do not appear anymore (of course no data can be parsed, then).
Does anyone have a clue what this could be?
What is the buffer? Could it be, another thread is reading the data from the buffer and clears it afterwards?
Try to lock it (and all other data shared between threads) e.g. with a mutex
QMutex mx; // could be also member of the ReadThread class
void ReadThread::readData()
{
mx.lock();
buffer.append(device->readAll();
mx.unlock();
}
And do the same in the code which reads and clears the buffer from another thread (I'm not doing the assumption, that this is parseData())
Another possibility could be, parseData() calls some code running in GUI-Thread. This doesn't work in Qt4 and probably also in Qt5
You're using the instance of a QObject from multiple threads at once. This generally speaking leads to undefined behavior, as you've just seen. QSerialPort will work just fine on the GUI thread. Only once you get it to work there, you can move it to a worker thread.
Note that if the event loop (app.exec() call in main() or QThread::run()) isn't executing, the signals won't be happening. It looks as if you tried to write pseudo synchronous code and have (predictably) failed. Don't do that.
Something like this is supposed to work:
#include <QtCore>
#include <QtSerialPort>
int main(int argc, char ** argv) {
QCoreApplication app(argc, argv);
QSerialPort port;
port.setPortName(...);
port.setBaudRate(...);
... // etc
if (! port.open(QIODevice::ReadWrite)) {
qWarning() << "can't open the port";
return 1;
}
... // set the port
connect(&port, &QIODevice::readyRead, [&]{
qDebug() << "got" << port.readAll().size() << "bytes";
});
return app.exec(); // the signals will be emitted from here
}
Ensure that all serial port related objects are initialized and used only in the separate thread. Send received data or parsed events to the UI thread by using signal/slot-mechanism.
Note also that if you inherit QThread in readThread, the constructor may be executed in the UI thread and other functions in the readThread. In that case, start the readThread and run separate initialization function before other functions (for example, by sending proper signal from the UI thread).

std::vector<std::mutex> stuck with optimization

I wrote a program which uses massive parallel execution. I am working with an Array of objects and an Array of mutexes for synchronization. My code Looks something like this:
std::vector<MyObject> objects;
std::vector<std::mutex> mutexes;
void work(int data)
{
for(unsigned int i = 0; i < objects.size(); ++i)
{
//Check if data Needs to be processed for objects[i]
if(dontNeedToProcess)continue;
mutexes[i].lock();
//Work with data for objects[i]
mutexes[i].unlock();
}
}
The function "work" is called by multiple threads with different data. After some hours (sometimes even days) the program is stuck. When I run it with gdb I can see that the program hangs while locking the mutex.
The Problem is now that I compiled the progrem with optimization (-O2) and the mutexes and "i" are optimized out.
Is it possible that the optimization causes this behavior when using an Array of mutexes?
Edit:
All the threads are at the same Position. The Backtrace Looks like the following:
#0 0xb7779d3c in __kernel_vsyscall ()
#1 0xb67ff672 in __lll_lock_wait ()
at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:144
#2 0xb67fb1c2 in _L_lock_920 ()
from /lib/i386-linux-gnu/i686/cmov/libpthread.so.0
#3 0xb67fb043 in __GI___pthread_mutex_lock (mutex=0x9de56530)
at ../nptl/pthread_mutex_lock.c:114
#4 0xb69043c4 in pthread_mutex_lock (mutex=0x9de56530) at forward.c:192
#5 0xb72754f3 in pthread_mutex_lock ()
from /usr/lib/i386-linux-gnu/libasan.so.1
#6 0x0809ac85 in __gthread_mutex_lock (__mutex=<optimized out>)
at /usr/include/i386-linux-gnu/c++/4.9/bits/gthr-default.h:748
#7 lock (this=<optimized out>) at /usr/include/c++/4.9/mutex:135
...

What could cause a dynamic_cast to crash?

I have a piece of code looking like this :
TAxis *axis = 0;
if (dynamic_cast<MonitorObjectH1C*>(obj))
axis = (dynamic_cast<MonitorObjectH1C*>(obj))->GetXaxis();
Sometimes it crashes :
Thread 1 (Thread -1208658240 (LWP 11400)):
#0 0x0019e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x048c67fb in __waitpid_nocancel () from /lib/tls/libc.so.6
#2 0x04870649 in do_system () from /lib/tls/libc.so.6
#3 0x048709c1 in system () from /lib/tls/libc.so.6
#4 0x001848bd in system () from /lib/tls/libpthread.so.0
#5 0x0117a5bb in TUnixSystem::Exec () from /opt/root/lib/libCore.so.5.21
#6 0x01180045 in TUnixSystem::StackTrace () from /opt/root/lib/libCore.so.5.21
#7 0x0117cc8a in TUnixSystem::DispatchSignals ()
from /opt/root/lib/libCore.so.5.21
#8 0x0117cd18 in SigHandler () from /opt/root/lib/libCore.so.5.21
#9 0x0117bf5d in sighandler () from /opt/root/lib/libCore.so.5.21
#10 <signal handler called>
#11 0x0533ddf4 in __dynamic_cast () from /usr/lib/libstdc++.so.6
I have no clue why it crashes. obj is not null (and if it was it would not be a problem, would it ?).
What could be the reason for a dynamic cast to crash ?
If it can't cast, it should just return NULL no ?
Some possible reasons for the crash:
obj points to an object with a non-polymorphic type (a class or struct with no virtual methods, or a fundamental type).
obj points to an object that has been freed.
obj points to unmapped memory, or memory that has been mapped in such a way as to generate an exception when accessed (such as a guard page or inaccessible page).
obj points to an object with a polymorphic type, but that type was defined in an external library that was compiled with RTTI disabled.
Not all of these problems necessarily cause a crash in all situations.
I suggest using a different syntax for this code snippet.
if (MonitorObjectH1C* monitorObject = dynamic_cast<MonitorObjectH1C*>(obj))
{
axis = monitorObject->GetXaxis();
}
You can still crash if some other thread is deleting what monitorObject points to or if obj is crazy garbage, but at least your problem isn't casting related anymore and you're not doing the dynamic_cast twice.
As it crashes only sometimes, i bet it's a threading issue. Check all references to 'obj':
grep -R 'obj.*=' .
dynamic_cast will return 0 if the cast fails and you are casting to a pointer, which is your case. The problem is that you have either corrupted the heap earlier in your code, or rtti wasn't enabled.
Are you sure that the value of 'obj' has been correctly defined?
If for example it is uninitialised (ie random) them I could see it causing a crash.
Can the value of obj be changed by a different thread?