This is related to an issue I have been discussing here and here, but as my investigations have led me away from the STL as the potential issue, and towards "new" as my nemesis, I thought it best to start a new thread.
To reiterate, I am using an arm-linux cross compiler (version 2.95.2) supplied by the embedded platform vendor.
When I run the application below on my Linux PC, it of course never fails. However, when running it on the embedded device, I get segmentation faults every time. Using "malloc" never fails. Synchronising the "new" allocation using the mutex stops the issue, but this is not practical in my main application.
Can anyone suggest why this might be occurring, or have any ideas how I can get around this issue?
Thanks.
#include <stdio.h>
#include <pthread.h>
pthread_mutex_t _logLock = PTHREAD_MUTEX_INITIALIZER;
static void* Thread(void *arg)
{
int i = 0;
while (i++ < 500)
{
// pthread_mutex_lock(&_logLock);
char* myDyn = (char*) new char[1023];
// char* buffer = (char*) malloc(1023);
// if (buffer == NULL)
// printf("Out of mem\n");
// free(buffer);
delete[] myDyn;
//pthread_mutex_unlock(&_logLock);
}
pthread_exit(NULL);
}
int main(int argc, char** argv)
{
int threads = 50;
pthread_t _rx_thread[threads];
for (int i = 0; i < threads; i++)
{
printf("Start Thread: %i\n", i);
pthread_create(&_rx_thread[i], NULL, Thread, NULL);
}
for (int i = 0; i < threads; i++)
{
pthread_join(_rx_thread[i], NULL);
printf("End Thread: %i\n", i);
}
}
If the heap on your device isn't thread-safe, then you need to lock. You could just write your own new and delete functions that lock for the duration of new or delete -- you don't need to hold the lock across the whole lifetime of the allocated memory.
Check to see if there are compiler switches to make the allocator thread-safe.
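A minimal sketch of the "write your own new and delete that lock" idea, assuming malloc/free is the underlying allocator and that a pthread mutex is available. The names g_heapLock, locked_alloc and locked_free are made up for illustration. It is written for a current compiler (noexcept); on a toolchain as old as 2.95 the exception specification would be throw() instead.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <pthread.h>

// Hypothetical lock: held only for the duration of each allocator call,
// not for the lifetime of the allocated memory.
static pthread_mutex_t g_heapLock = PTHREAD_MUTEX_INITIALIZER;

static void* locked_alloc(std::size_t size)
{
    if (size == 0) size = 1;              // operator new must not return NULL for size 0
    pthread_mutex_lock(&g_heapLock);
    void* p = std::malloc(size);          // assumption: malloc is the backing allocator
    pthread_mutex_unlock(&g_heapLock);
    if (p == NULL) throw std::bad_alloc();
    return p;
}

static void locked_free(void* p)
{
    if (p == NULL) return;
    pthread_mutex_lock(&g_heapLock);
    std::free(p);
    pthread_mutex_unlock(&g_heapLock);
}

// Replacing the global forms makes every plain new/delete in the program go through the lock.
void* operator new(std::size_t size)    { return locked_alloc(size); }
void* operator new[](std::size_t size)  { return locked_alloc(size); }
void operator delete(void* p) noexcept   { locked_free(p); }
void operator delete[](void* p) noexcept { locked_free(p); }
```

If malloc on the device is already thread-safe, as the question suggests, the lock is redundant and simply forwarding to malloc/free may be enough.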
As others have stated, chances are that the toolset's default memory allocation behavior is not thread-safe. I just checked with the 2 ARM cross-development toolsets I use the most, and indeed this is the case with one of them.
Most toolsets offer ways of making the libraries thread-safe, either by re-implementing functions or by linking in a different (thread-safe) version of the library. Without more information from you, it's hard to say what your particular toolset is doing.
Hate to say it, but the best information is probably in your toolset documentation.
Yeah, new is probably not threadsafe. You need synchronization mechanisms around the memory allocation and, separately, around the deletion. Look into the Boost.thread library, which provides mutex types that should help you out.
How about using malloc (you say it never fails on the embedded platform) to get the required memory, then using placement new (void* operator new[] (std::size_t size, void* ptr) throw()), assuming it is available, for construction? See the new[] operator reference, the related Stack Overflow article, and MSDN.
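A rough sketch of the malloc plus placement new combination, for a single object rather than an array. Message, create_message and destroy_message are hypothetical names; the point is only that the memory comes from malloc and the object is constructed into it, so the allocating form of operator new is never called.

```cpp
#include <cstdlib>
#include <new>

struct Message {                 // hypothetical type, for illustration only
    int id;
    explicit Message(int i) : id(i) {}
};

Message* create_message(int id)
{
    void* raw = std::malloc(sizeof(Message));
    if (raw == NULL) return NULL;
    return new (raw) Message(id);    // placement new: constructs in raw, allocates nothing
}

void destroy_message(Message* m)
{
    if (m == NULL) return;
    m->~Message();                   // run the destructor by hand
    std::free(m);                    // then give the memory back to malloc's heap
}
```

The cost of this approach is that every creation and destruction site has to use the helper pair instead of plain new and delete.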
What happens to data created in the local scope of a thread if the thread is terminated? Is it a memory leak?
void MyThread()
{
auto* ptr = new int[10];
while (true)
{
// stuff
}
// thread is interrupted before this delete
delete[] ptr;
}
Okay, my perspective.
If the program exits, the threads exit wherever they are. They don't clean up. But in this case you don't care. You might care if it's an open file and you want it flushed.
However, I prefer a way to tell my threads to exit cleanly. This isn't perfect, but instead of while (true) you can do while (iShouldRun) and set the field to false when it's time for the thread to exit.
You can also set a flag that says, iAmExiting at the end, then myThread.join() once the flag is set. That gives your exit code a chance to clean up nicely.
Coding this from the beginning helps when you write your unit tests.
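Here is a small sketch of the flag approach, assuming C++11 and using std::atomic<bool> so the write from the controlling thread is visible to the worker. The counter iterations and the helper run_and_stop are hypothetical and exist only to make the example observable. Because the loop now ends, the delete[] after it is actually reached.

```cpp
#include <atomic>
#include <thread>

std::atomic<bool> iShouldRun(true);   // the "keep going" flag
std::atomic<int> iterations(0);       // hypothetical counter so the work is visible

void MyThread()
{
    int* ptr = new int[10];
    while (iShouldRun.load()) {
        ptr[0] = ++iterations;        // stand-in for "stuff"
        std::this_thread::yield();
    }
    delete[] ptr;                     // reached once the flag is cleared
}

void run_and_stop()
{
    std::thread worker(MyThread);
    while (iterations.load() == 0)    // wait until the worker has looped at least once
        std::this_thread::yield();
    iShouldRun.store(false);          // ask the thread to finish
    worker.join();                    // and wait for it to clean up
}
```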
The other thing -- as someone mentioned in comments -- use RAII. Pretty much if you're using raw pointers, you're doing something you shouldn't do in modern C++.
That's not an absolute. You can write your own RAII classes. For instance:
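class MyIntArray {
public:
explicit MyIntArray(int sizeIn) : array(new int[sizeIn]), size(sizeIn) {}
~MyIntArray() { delete[] array; } // array form, to match new[]
// Copying is disabled so two objects never delete the same array.
MyIntArray(const MyIntArray&) = delete;
MyIntArray& operator=(const MyIntArray&) = delete;
private:
int * array = nullptr;
int size = 0;
};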
You'll need a few more methods to actually get to the data, like an operator[]. Now, this isn't any different than using std::vector, so it's only an example of how to implement RAII for your custom data, for instance.
But your functions should NEVER call new like this. It's old-school. If your method pukes somehow, you have a memory leak. If it pukes on exit(), no one cares. But if it pukes for another reason, it's a problem. RAII is a much, much better solution than the other patterns.
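To make the "if your method pukes" case concrete, here is a sketch where the function fails part-way through. The function process and its fail parameter are hypothetical. With std::vector owning the buffer, the storage is released on the exception path as well as on the normal return, with no delete anywhere.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical worker: fills a buffer, then optionally fails.
std::size_t process(std::size_t n, bool fail)
{
    std::vector<int> data(n, 0);      // owns its storage; freed on every exit path
    for (std::size_t i = 0; i < n; ++i)
        data[i] = static_cast<int>(i);
    if (fail)
        throw std::runtime_error("processing failed");  // data is still destroyed here
    return data.size();
}
```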
Why shouldn't I unlock a mutex from a different thread? The C++ standard says it pretty clearly: if the mutex is not currently locked by the calling thread, unlocking it causes undefined behavior. But as far as I can see, everything works as expected on Linux (Fedora 31 with GCC). I seriously tried everything but I could not get it to behave strangely.
All I'm asking for is an example where something, literally anything is affected by unlocking a mutex from a different thread.
Here is a quick test I wrote which is super wrong and shouldn't work, but it does:
#include <cassert>
#include <cstdint>
#include <iostream>
#include <mutex>
#include <thread>
std::mutex* testEvent;
int main()
{
testEvent = new std::mutex[1000];
for(uint32_t i = 0; i < 1000; ++i) testEvent[i].lock();
std::thread threads[2000];
auto lock = [](uint32_t index) ->void { testEvent[index].lock(); assert(!testEvent[index].try_lock()); };
auto unlock = [](uint32_t index) ->void { testEvent[index].unlock(); };
for(uint32_t j = 0; j < 1000; ++j)
{
for(uint32_t i = 0; i < 1000; ++i)
{
threads[i] = std::thread(lock,i);
threads[i+1000] = std::thread(unlock,i);
}
for(uint32_t i = 0; i < 2000; ++i)
{
threads[i].join();
}
std::cout << j << std::endl;
}
delete[] testEvent;
}
As you already said, it is UB. UB means it may work. Or not. Or randomly switch between working and making your computer sing itself a lullaby. (See also "nasal demons".)
Here are just a few ways someone can break your program on Fedora 31 with GCC on x86-64:
Compile with -fsanitize=thread. It will now crash every time, which is still a valid C++ implementation, because UB.
Run under helgrind (valgrind --tool=helgrind ./a.out). It will crash every time -- still a valid way to host a C++ program, because UB.
The libstdc++/glibc/pthread implementation on the target system switches from using "fast" mutexes by default to "error checking" or "recursive" mutexes (https://manpages.debian.org/jessie/glibc-doc/pthread_mutex_init.3.en.html). Note that this is probably possible in a manner that is ABI-compatible with your program, meaning that it does not even have to be recompiled for it to suddenly stop working.
That being said, since you are using a platform on which the C++ mutex boils down to a futex-implemented "fast" pthread mutex, this does not work by accident. It is just not guaranteed to keep working for any time, or in any circumstance that actually checks if you are doing the right thing.
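As a sketch of the "error checking" case, assuming the pthread API is used directly rather than std::mutex: POSIX specifies that unlocking a PTHREAD_MUTEX_ERRORCHECK mutex from a thread that does not own it fails with EPERM, so the foreign unlock is rejected instead of silently succeeding. UnlockAttempt and the function names are made up for illustration.

```cpp
#include <cerrno>
#include <pthread.h>

struct UnlockAttempt {            // hypothetical helper to carry the result back
    pthread_mutex_t* mutex;
    int result;
};

static void* unlock_from_other_thread(void* arg)
{
    UnlockAttempt* attempt = static_cast<UnlockAttempt*>(arg);
    attempt->result = pthread_mutex_unlock(attempt->mutex);  // this thread is not the owner
    return NULL;
}

int foreign_unlock_result()
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);

    pthread_mutex_t m;
    pthread_mutex_init(&m, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&m);       // locked by the calling thread

    UnlockAttempt attempt = { &m, 0 };
    pthread_t t;
    pthread_create(&t, NULL, unlock_from_other_thread, &attempt);
    pthread_join(t, NULL);

    pthread_mutex_unlock(&m);     // the real owner unlocks
    pthread_mutex_destroy(&m);
    return attempt.result;        // EPERM for an error-checking mutex
}
```

With the default mutex type the same call sequence is undefined, which is why it can appear to work.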
I really wonder why you would want to do this in the first place ;)
Normally you would want to have something like
lock();
do_critical_task();
unlock();
(In C++, the lock/unlock is often hidden by use of std::lock_guard or similar.)
Let's assume one thread (let's say Thread A) called this code and is inside the critical task, i.e. it is also holding the lock.
Then if you unlock the same mutex from another thread, any thread other than A can also enter the critical section simultaneously.
The main purpose of mutexes is to have mutual exclusion (hence their name), so all you would do is to erase the purpose of the mutex ;)
That said: you should always believe the standard. Just because something works on a certain system doesn't mean it's portable. Plus: especially in a concurrent context, a lot of things can work a thousand times but then fail the 1001st time because of race conditions.
In mathematics your attempt would be comparable to "proof by example".
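For reference, the lock / do_critical_task / unlock pattern written with std::lock_guard might look like the sketch below. The names counterMutex, counter, worker and run_workers are hypothetical. The guard locks in its constructor and unlocks in its destructor, so the unlock always happens on the thread that took the lock.

```cpp
#include <mutex>
#include <thread>
#include <vector>

std::mutex counterMutex;          // hypothetical shared state for the demo
long counter = 0;

void do_critical_task() { ++counter; }

void worker(int iterations)
{
    for (int i = 0; i < iterations; ++i) {
        std::lock_guard<std::mutex> guard(counterMutex);  // lock()
        do_critical_task();
    }                                                     // unlock() at the end of each iteration
}

long run_workers(int threads, int iterations)
{
    counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back(worker, iterations);
    for (std::thread& th : pool)
        th.join();
    return counter;               // no increments are lost while the mutex is respected
}
```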
I posted a previous question "Seg Fault when using std::string on an embedded Linux platform" where I got some very useful advice. I have been away on other projects since then and have recently returned to looking at this issue.
To reiterate, I am restricted to using the arm-linux cross compiler (version 2.95.2) as this is what is supplied and supported by the embedded platform vendor. I understand that the issue is likely because the stdlib is very old, and not particularly thread safe.
The problem is that whenever I use the STL containers in multiple threads, I end up with a segmentation fault. The code below will consistently seg fault unless I use pthread_mutex_lock and scope operators around the container declarations (as in other post).
It is not feasible to use this approach in my application as I pass the containers around to different methods and classes. I would ideally like to solve this problem, or find a suitable alternative. I have tried STLPort and SGI's Standard Template Library with the same results. I can only assume that because they are being linked by the very old gcc, they cannot solve the problem.
Does anyone have any possible recommendations or solutions? Or perhaps you can suggest an implementation of vector (and string) that I can drop into my code?
Thanks in advance for any guidance.
#include <stdio.h>
#include <pthread.h>
#include <vector>
#include <list>
#include <string>
using namespace std;
/////////////////////////////////////////////////////////////////////////////
class TestSeg
{
static pthread_mutex_t _logLock;
public:
TestSeg()
{
}
~TestSeg()
{
}
static void* TestThread( void *arg )
{
int i = 0;
while ( i++ < 10000 )
{
printf( "%d\n", i );
WriteBad( "Function" );
}
pthread_exit( NULL );
}
static void WriteBad( const char* sFunction )
{
//pthread_mutex_lock( &_logLock );
//{
printf( "%s\n", sFunction );
string sKiller; // <----------------------------------Bad
//list<char> killer; // <----------------------------------Bad
//vector<char> killer; // <----------------------------------Bad
//}
//pthread_mutex_unlock( &_logLock );
return;
}
void RunTest()
{
int threads = 100;
pthread_t _rx_thread[threads];
for ( int i = 0 ; i < threads ; i++ )
{
pthread_create( &_rx_thread[i], NULL, TestThread, NULL );
}
for ( int i = 0 ; i < threads ; i++ )
{
pthread_join( _rx_thread[i], NULL );
}
}
};
pthread_mutex_t TestSeg::_logLock = PTHREAD_MUTEX_INITIALIZER;
int main( int argc, char *argv[] )
{
TestSeg seg;
seg.RunTest();
pthread_exit( NULL );
}
The issue is not with the containers, it's with your code.
It is completely unnecessary to make the containers themselves threadsafe, because what you need, first and foremost, is transaction-like semantics.
Let's assume, for the sake of demonstration, that you have a threadsafe implementation of vector, for example.
Thread 1: if (!vec.empty())
Thread 2: vec.clear();
Thread 1: foo = vec.front();
This leads to undefined behavior.
The issue is that making each operation on the container threadsafe is pretty much pointless, because you still need to be able to lock the container itself for several operations in a row. You would therefore lock for your various operations, and then lock again on each and every operation?
As I said: completely unnecessary.
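A sketch of what "transaction-like" means for the empty()/front() example, assuming C++11 and one mutex that guards the vector as a whole. vecMutex, try_front and clear_all are hypothetical names. The check and the read happen under the same lock, so a clear() from another thread cannot slip in between them.

```cpp
#include <mutex>
#include <vector>

std::mutex vecMutex;              // hypothetical: one lock for the whole container
std::vector<int> vec;

// Check and read as a single transaction.
bool try_front(int& out)
{
    std::lock_guard<std::mutex> guard(vecMutex);
    if (vec.empty())
        return false;
    out = vec.front();            // still non-empty: nobody else can hold the lock here
    return true;
}

void clear_all()
{
    std::lock_guard<std::mutex> guard(vecMutex);
    vec.clear();
}
```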
Part of your query might be answered in another thread. The design of C++, including the standard library, is influenced by many factors. Efficiency is a repeated theme. Thread safety mechanisms often are at odds with an objective of efficiency. The age of the library is not really the issue.
For your situation, you may be able to wrap the STL vector in your own vector class (you might consider a Decorator) that contains the locking mechanism and provides the lock/unlock logic around accesses.
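One possible shape for such a wrapper, as a sketch only: LockedVector and with_lock are invented names, and std::mutex stands in for whatever lock the platform offers (a pthread_mutex_t on an older toolchain). Single operations lock internally, and with_lock lets the caller run several operations under one lock, which covers the multi-step case raised in the other answer.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

template <typename T>
class LockedVector {              // hypothetical wrapper around std::vector
public:
    void push_back(const T& value)
    {
        std::lock_guard<std::mutex> guard(mutex_);
        items_.push_back(value);
    }

    std::size_t size() const
    {
        std::lock_guard<std::mutex> guard(mutex_);
        return items_.size();
    }

    // Run a caller-supplied function on the vector while holding the lock once.
    template <typename F>
    void with_lock(F f)
    {
        std::lock_guard<std::mutex> guard(mutex_);
        f(items_);
    }

private:
    mutable std::mutex mutex_;    // mutable so const members can lock
    std::vector<T> items_;
};
```

The wrapper can be passed around by reference between methods and classes, which keeps the locking in one place instead of at every call site.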
I have an application that is parallelized using pthreads. The application has an iterative routine call and a thread spawn within the routine (pthread_create and pthread_join) to parallelize the computation-intensive section in the routine. When I use an instrumenting tool like PIN to collect the statistics, the tool reports statistics for several threads (number of threads x number of iterations). I believe this is because it is spawning a new set of threads each time the routine is called.
How can I ensure that I create the threads only once, and that all successive calls use the threads that were created first?
When I do the same with OpenMP and then try to collect the statistics, I see that the threads are created only once. Is it because of the OpenMP runtime?
EDIT:
I'm just giving a simplified version of the code.
int main()
{
//some code
do {
compute_distance(objects,clusters, &delta); //routine with pthread
} while (delta > threshold);
}
void compute_distance(double **objects,double *clusters, double *delta)
{
//some code again
//computation moved to a separate parallel routine..
for (i = 0; i < nthreads; i++)
pthread_create(&thread[i],&attr,parallel_compute_phase,(void*)&ip);
for (i = 0; i < nthreads; i++)
rc = pthread_join(thread[i], &status);
}
I hope this clearly explains the problem.
How do we save the thread id and test if it was already created?
You can make a simple thread pool implementation which creates threads and makes them sleep. Once a thread is required, instead of calling "pthread_create", you can ask the thread pool subsystem to pick up a thread and do the required work. This gives you control over the number of threads.
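A minimal thread pool sketch along those lines, assuming C++11 (std::thread and condition variables) rather than raw pthreads. ThreadPool, submit and wait_idle are invented names. The workers are created once in the constructor and sleep on a condition variable; submit replaces the pthread_create loop and wait_idle replaces the pthread_join loop in each iteration.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class ThreadPool {                // hypothetical minimal pool
public:
    explicit ThreadPool(std::size_t n) : pending_(0), stopping_(false)
    {
        for (std::size_t i = 0; i < n; ++i)
            workers_.emplace_back(&ThreadPool::run, this);   // threads are created only here
    }

    ~ThreadPool()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& t : workers_) t.join();
    }

    void submit(std::function<void()> task)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
            ++pending_;
        }
        wake_.notify_one();
    }

    // Block until every task submitted so far has finished.
    void wait_idle()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return pending_ == 0; });
    }

private:
    void run()
    {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;      // only possible when stopping
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();                              // run outside the lock
            std::lock_guard<std::mutex> lock(mutex_);
            if (--pending_ == 0) idle_.notify_all();
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_, idle_;
    std::queue<std::function<void()> > tasks_;
    std::size_t pending_;
    bool stopping_;
    std::vector<std::thread> workers_;
};
```

In the question's loop, the pool would be constructed once before the do/while, and compute_distance would call submit for each chunk of work followed by wait_idle.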
An easy thing you can do with minimal code changes is to write some wrappers for pthread_create and _join. Basically you can do something like:
typedef struct {
volatile int go;
volatile int done;
pthread_t h;
void* (*fn)(void*);
void* args;
} pthread_w_t;
void* pthread_w_fn(void* args) {
pthread_w_t* p = (pthread_w_t*)args;
// just let the thread be killed at the end
for(;;) {
while (!p->go) { pthread_yield(); }; // yields are good
p->go = 0; // don't want to go again until told to
p->fn(p->args);
p->done = 1;
}
}
int pthread_create_w(pthread_w_t* th, pthread_attr_t* a,
void* (*fn)(void*), void* args) {
int rc = 0;
if (!th->h) {
th->done = 0;
th->go = 0;
th->fn = fn;
th->args = args;
rc = pthread_create(&th->h,a,pthread_w_fn,th); // the real thread is only created once
}
th->done = 0; //make sure join won't return too soon
th->go = 1; //and let the wrapper function start the real thread code
return rc;
}
int pthread_join_w(pthread_w_t*th) {
while (!th->done) { pthread_yield(); };
return 0;
}
and then you'll have to change your calls and pthread_ts, or create some #define macros to change pthread_create to pthread_create_w etc....and you'll have to init your pthread_w_ts to zero.
Messing with those volatiles can be troublesome though. You'll probably need to spend some time getting my rough outline to actually work properly.
To ensure something that several threads might try to do only happens once, use pthread_once(). To ensure something only happens once that might be done by a single thread, just use a bool (likely one in static storage).
Honestly, it would be far easier to answer your question for everyone if you would edit your question – not comment, since that destroys formatting – to contain the real code in question, including the OpenMP pragmas.
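A small sketch of the pthread_once() suggestion. The names g_initOnce, g_initCount, init_shared_state and run_once_demo are hypothetical; in the real application the init routine is where the worker threads (or a pool) would be created. No matter how many threads reach the call, the init routine runs exactly once.

```cpp
#include <pthread.h>

static pthread_once_t g_initOnce = PTHREAD_ONCE_INIT;
static int g_initCount = 0;       // hypothetical: counts how often init really ran

static void init_shared_state()
{
    ++g_initCount;                // e.g. create the long-lived worker threads here
}

static void* routine(void*)
{
    pthread_once(&g_initOnce, init_shared_state);  // safe from any number of threads
    return NULL;
}

int run_once_demo(int nthreads)
{
    pthread_t threads[16];
    if (nthreads > 16) nthreads = 16;
    for (int i = 0; i < nthreads; ++i)
        pthread_create(&threads[i], NULL, routine, NULL);
    for (int i = 0; i < nthreads; ++i)
        pthread_join(threads[i], NULL);
    return g_initCount;
}
```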
I am trying to create a thread in C++ (Win32) to run a simple method. I'm new to C++ threading, but very familiar with threading in C#. Here is some pseudo-code of what I am trying to do:
static void MyMethod(int data)
{
RunStuff(data);
}
void RunStuff(int data)
{
//long running operation here
}
I want to call RunStuff from MyMethod without it blocking. What would be the simplest way of running RunStuff on a separate thread?
Edit: I should also mention that I want to keep dependencies to a minimum. (No MFC... etc)
#include <boost/thread.hpp>
static boost::thread runStuffThread;
static void MyMethod(int data)
{
runStuffThread = boost::thread(boost::bind(RunStuff, data));
}
// elsewhere...
runStuffThread.join(); //blocks
C++11, available with more recent compilers such as Visual Studio 2013, has threads as part of the standard library, along with quite a few other nice bits and pieces such as lambdas.
The <thread> header provides the std::thread class. The thread functionality is in the std:: namespace. Some functions that act on the calling thread, such as sleep_for() and yield(), live in the std::this_thread namespace (see Why the std::this_thread namespace? for a bit of explanation).
The following console application example using Visual Studio 2013 demonstrates some of the thread functionality of C++11, including the use of a lambda (see What is a lambda expression in C++11?). Notice that the functions used for thread sleep, such as std::this_thread::sleep_for(), use durations from std::chrono.
// threading.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <iostream>
#include <chrono>
#include <thread>
#include <mutex>
int funThread(const char *pName, const int nTimes, std::mutex *myMutex)
{
// loop the specified number of times each time waiting a second.
// we are using this mutex, which is shared by the threads to
// synchronize and allow only one thread at a time to output.
for (int i = 0; i < nTimes; i++) {
myMutex->lock();
std::cout << "thread " << pName << " i = " << i << std::endl;
// delay this thread that is running for a second.
// the this_thread construct allows us access to several different
// functions such as sleep_for() and yield(). we do the sleep
// before doing the unlock() to demo how the lock/unlock works.
std::this_thread::sleep_for(std::chrono::seconds(1));
myMutex->unlock();
std::this_thread::yield();
}
return 0;
}
int _tmain(int argc, _TCHAR* argv[])
{
// create a mutex which we are going to use to synchronize output
// between the two threads.
std::mutex myMutex;
// create and start two threads each with a different name and a
// different number of iterations. we provide the mutex we are using
// to synchronize the two threads.
std::thread myThread1(funThread, "one", 5, &myMutex);
std::thread myThread2(funThread, "two", 15, &myMutex);
// wait for our two threads to finish.
myThread1.join();
myThread2.join();
auto fun = [](int x) {for (int i = 0; i < x; i++) { std::cout << "lambda thread " << i << std::endl; std::this_thread::sleep_for(std::chrono::seconds(1)); } };
// create a thread from the lambda above requesting three iterations.
std::thread xThread(fun, 3);
xThread.join();
return 0;
}
CreateThread (Win32) and AfxBeginThread (MFC) are two ways to do it.
Either way, your MyMethod signature would need to change a bit.
Edit: as noted in the comments and by other respondents, CreateThread can be bad.
_beginthread and _beginthreadex are the C runtime library functions, and according to the docs are equivalent to System::Threading::Thread::Start
Consider using the Win32 thread pool instead of spinning up new threads for work items. Spinning up new threads is wasteful - each thread gets 1 MB of reserved address space for its stack by default, runs the system's thread startup code, causes notifications to be delivered to nearly every DLL in your process, and creates another kernel object. Thread pools enable you to reuse threads for background tasks quickly and efficiently, and will grow or shrink based on how many tasks you submit. In general, consider spinning up dedicated threads for never-ending background tasks and use the threadpool for everything else.
Before Vista, you can use QueueUserWorkItem. On Vista and later, the new thread pool APIs are more reliable and offer a few more advanced options. Each will cause your background code to start running on some thread pool thread.
// Vista
VOID CALLBACK MyWorkerFunction(PTP_CALLBACK_INSTANCE instance, PVOID context);
// Returns true on success.
TrySubmitThreadpoolCallback(MyWorkerFunction, context, NULL);
// Pre-Vista
DWORD WINAPI MyWorkerFunction(PVOID context);
// Returns true on success
QueueUserWorkItem(MyWorkerFunction, context, WT_EXECUTEDEFAULT);
Simple threading in C++ is a contradiction in terms!
Check out boost threads for the closest thing to a simple approach available today.
For a minimal answer (which will not actually provide you with all the things you need for synchronization, but answers your question literally) see:
http://msdn.microsoft.com/en-us/library/kdzttdcb(VS.80).aspx
Also static means something different in C++.
Is this safe:
unsigned __stdcall myThread(void *ArgList) {
//Do stuff here
}
_beginthread(myThread, 0, &data);
Do I need to do anything to release the memory (like CloseHandle) after this call?
Another alternative is pthreads - they work on both windows and linux!
CreateThread (Win32) and AfxBeginThread (MFC) are two ways to do it.
Be careful to use _beginthread if you need to use the C run-time library (CRT) though.
For win32 only and without additional libraries you can use
CreateThread function
http://msdn.microsoft.com/en-us/library/ms682453(VS.85).aspx
If you really don't want to use third-party libs (I would recommend boost::thread as explained in the other answers), you need to use the Win32 API:
#include <windows.h>
static DWORD WINAPI RunStuff(LPVOID param); // declared before it is used below
static void MyMethod(int data)
{
HANDLE hThread = ::CreateThread(NULL,
0,
&RunStuff,
reinterpret_cast<LPVOID>(static_cast<INT_PTR>(data)), // widen first so the cast is valid on 64-bit
0,
NULL);
// you can do whatever you want here
::WaitForSingleObject(hThread, INFINITE);
::CloseHandle(hThread);
}
static DWORD WINAPI RunStuff(LPVOID param)
{
int data = static_cast<int>(reinterpret_cast<INT_PTR>(param));
//long running operation here
return 0;
}
There exist many open-source cross-platform C++ threading libraries you could use.
Among them are:
Qt
Intel TBB
Boost thread
The way you describe it, I think either Intel TBB or Boost thread will be fine.
Intel TBB example:
class RunStuff
{
public:
// TBB mandates that you supply () operator
void operator ()()
{
// long running operation here
}
};
// Here's sample code to instantiate it
#include <tbb/tbb_thread.h>
RunStuff body;
tbb::tbb_thread my_thread(body); // pass an instance, not the class name
Boost thread example:
http://www.ddj.com/cpp/211600441
Qt example:
http://doc.trolltech.com/4.4/threads-waitconditions-waitconditions-cpp.html
(I don't think this suits your needs, but it is included here for completeness; you have to inherit QThread, implement void run(), and call QThread::start().)
If you only program on Windows and don't care about cross-platform support, perhaps you could use Windows threads directly:
http://www.codersource.net/win32_multithreading.html