I have a libpthread-linked application. The core of the application is two FIFOs shared by four threads (two threads per FIFO, that is ;). The FIFO class is synchronized using pthread mutexes, and it stores pointers to big classes (containing buffers of about 4 KB) allocated inside static memory using overloaded new and delete operators (no dynamic allocation here).
The program itself usually works fine, but from time to time it segfaults for no visible reason. The problem is that I can't debug the segfaults properly, as I'm working on an embedded system with an old Linux kernel (2.4.29) and g++ (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)).
There's no gdb on the system, and I can't run the application elsewhere ( it's too hardware specific ).
I compiled the application with -g and -rdynamic flags, but an external gdb tells me nothing when I examine the core file ( only hex addresses ) - still I can print the backtrace from the program after catching SIGSEGV - it always looks like this:
Backtrace for process with pid: 6279
-========================================-
[0x8065707]
[0x806557a]
/lib/libc.so.6(sigaction+0x268) [0x400bfc68]
[0x8067bb9]
[0x8067b72]
[0x8067b25]
[0x8068429]
[0x8056cd4]
/lib/libpthread.so.0(pthread_detach+0x515) [0x40093b85]
/lib/libc.so.6(__clone+0x3a) [0x4015316a]
-========================================-
End of backtrace
So it seems to be pointing to libpthread...
I ran some of the modules through valgrind, but I didn't find any memory leaks (as I'm barely using any dynamic allocation ).
I thought that maybe the mutexes are causing some trouble ( as they are being locked/unlocked about 200 times a second ) so I switched my simple mutex class:
class AGMutex {
public:
AGMutex( void ) {
pthread_mutex_init( &mutex1, NULL );
}
~AGMutex( void ) {
pthread_mutex_destroy( &mutex1 );
}
void lock( void ) {
pthread_mutex_lock( &mutex1 );
}
void unlock( void ) {
pthread_mutex_unlock( &mutex1 );
}
private:
pthread_mutex_t mutex1;
};
to a dummy mutex class:
class AGMutex {
public:
AGMutex( void ) : mutex1( false ) {
}
~AGMutex( void ) {
}
volatile void lock( void ) {
if ( mutex1 ) {
while ( mutex1 ) {
usleep( 1 );
}
}
mutex1 = true;
}
volatile void unlock( void ) {
mutex1 = false;
}
private:
volatile bool mutex1;
};
but it changed nothing and the backtrace looks the same...
After an old-school put-cout-between-every-line-and-see-where-it-segfaults-plus-remember-the-pids-and-stuff debugging session, it seems that it segfaults during usleep (?).
I have no idea what else could be wrong. It can work for an hour or so, and then suddenly segfault for no apparent reason.
Has anybody ever encountered a similar problem?
From my answer to How to generate a stacktrace when my gcc C++ app crashes:
The first two entries in the stack frame chain when you get into the
signal handler contain a return address inside the signal handler and
one inside sigaction() in libc. The stack frame of the last function
called before the signal (which is the location of the fault) is lost.
This may explain why you are having difficulties determining the location of your segfault via a backtrace from a signal handler. My answer also includes a workaround for this limitation.
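As a rough illustration of the kind of handler being discussed, here is a minimal sketch that installs a SIGSEGV handler with SA_SIGINFO, prints the faulting data address (si_addr), and dumps the raw frames. It assumes a glibc that provides <execinfo.h>; the names segvHandler and installSegvHandler are made up for this example, and backtrace() is not strictly async-signal-safe, so treat it as a debugging aid only. Note that si_addr is the address whose access faulted, not the instruction that faulted, so the frame of the faulting function can still be missing from the output.

```cpp
#include <execinfo.h>   // backtrace, backtrace_symbols_fd (glibc extension)
#include <signal.h>
#include <string.h>
#include <unistd.h>

// Write raw bytes to stderr; write() is async-signal-safe, printf/cout are not.
static void emit(const char* text, size_t length)
{
    ssize_t ignored = write(STDERR_FILENO, text, length);
    (void)ignored;
}

// Format a value as hex without using any heap or stdio.
static void emitHex(unsigned long value)
{
    char buf[2 + sizeof(value) * 2 + 1];
    char* p = buf + sizeof(buf);
    *--p = '\n';
    do {
        *--p = "0123456789abcdef"[value & 0xf];
        value >>= 4;
    } while (value != 0);
    *--p = 'x';
    *--p = '0';
    emit(p, static_cast<size_t>(buf + sizeof(buf) - p));
}

// Hypothetical handler: reports the faulting address, then the raw frames.
extern "C" void segvHandler(int sig, siginfo_t* info, void* /*ucontext*/)
{
    static const char header[] = "SIGSEGV, faulting address: ";
    emit(header, sizeof(header) - 1);
    emitHex(reinterpret_cast<unsigned long>(info->si_addr));

    void* frames[64];
    int count = backtrace(frames, 64);
    backtrace_symbols_fd(frames, count, STDERR_FILENO);
    _exit(128 + sig);   // do not return into the faulting instruction
}

// Returns true if the handler was installed.
bool installSegvHandler()
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = segvHandler;
    sa.sa_flags = SA_SIGINFO;      // request the three-argument handler form
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, nullptr) == 0;
}
```

Calling installSegvHandler() once at startup, before the threads are created, should be enough, since signal dispositions are shared by all threads of a process.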
If you want to see how your application actually is laid out in memory (i.e. 0x80..... addresses), you should be able to generate a map file from gcc. This is typically done via -Wl,-Map,output.map, which passes -Map output.map to the linker.
You may also have a hardware-specific version of objdump or nm with your toolchain/cross-toolchain that may be helpful in deciphering your 0x80..... addresses.
Do you have access to Helgrind on your platform? It's a Valgrind tool for detecting POSIX thread errors such as races and threads holding mutexes when they exit.
I have a C++ Windows (compiled with Visual Studio 2019) program that uses shared libraries. A shared library uses a singleton on a class that creates a thread. The class destructor kills the thread cleanly, so there should be no memory leak. However, I see that the destructor is being invoked after the system actually killed all running threads upon exit, so it's too late, the thread is not exited cleanly and this introduces a memory leak (and possibly other problems depending on the code being processed by the thread).
Here is a MCVE:
#include <thread>
#include <atomic>
class Single
{
public:
static Single& GetInstance()
{
static Single single;
return single;
}
int doSomething()
{
while ( !started )
std::this_thread::sleep_for( std::chrono::milliseconds(100) );
return 0;
}
private:
Single() :
started( false ),
continueThread( true )
{
thread = new std::thread( &Single::threadFunc, this );
}
~Single()
{
continueThread = false;
thread->join();
delete thread;
}
void threadFunc()
{
started = true;
while ( continueThread )
{
std::this_thread::sleep_for( std::chrono::milliseconds(1) );
}
}
std::atomic_bool started;
std::atomic_bool continueThread;
std::thread* thread;
};
int main( int argc, char* argv[] )
{
return Single::GetInstance().doSomething();
}
If this is copied to a single main.cpp file and executed, everything works fine. When ~Single is executed, in the debugger, I see the threadFunc thread is running and it gets stopped cleanly.
Now, if the Single definition and implementation are moved to a separate DLL, then when ~Single is executed, I see in the debugger that the threadFunc thread is no longer running (the system has already stopped it), and the code can't stop it cleanly. Visual Leak Detector then reports a memory leak.
Is there any flag (in code or at compiler level) that could be set to guarantee threads are not destroyed before the singleton gets deleted?
I know I could call a deinit function manually from the main function, but at some point, main may not even know there is a singleton running a thread in the shared library it uses... the shared library itself should be able to exit cleanly.
No.
Multithreading, automatic cleanup, and DLL unloading are basically a huge mess on Windows once they interact.
The solution is to not have singletons, or any static lifetime variables (globals, local statics, class statics) with non-trivial destruction semantics. Make an instance of your thing in main()/WinMain(). Pass a reference to whoever needs it. Let the destructor clean it up before main exits and thus before everything gets unloaded.
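A minimal sketch of that approach, reusing the thread logic from the question: the object is an ordinary local, so its destructor (and the join) runs before main returns. The names Worker and runApplication are invented for this example, and runApplication simply stands in for the body of main()/WinMain().

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical replacement for the Single singleton: same thread logic,
// but the lifetime is decided by whoever constructs the object.
class Worker
{
public:
    Worker()
        : started_(false),
          keepRunning_(true),
          thread_(&Worker::threadFunc, this)   // started last, see member order
    {
    }

    ~Worker()
    {
        keepRunning_ = false;          // ask the thread to leave its loop
        assert(thread_.joinable());
        thread_.join();                // the thread is still alive here
    }

    Worker(const Worker&) = delete;
    Worker& operator=(const Worker&) = delete;

    int doSomething()
    {
        while (!started_)
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        return 0;
    }

private:
    void threadFunc()
    {
        started_ = true;
        while (keepRunning_)
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }

    // Declaration order matters: the flags are initialized before the thread
    // that reads them is started.
    std::atomic<bool> started_;
    std::atomic<bool> keepRunning_;
    std::thread thread_;
};

// Stand-in for the body of main(): the Worker is a local, so ~Worker runs
// (and joins the thread) before the function returns.
int runApplication()
{
    Worker worker;
    return worker.doSomething();
}
```

Code that needs the worker would take a Worker& parameter instead of calling a GetInstance() function.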
Or simply ignore the memory leak. The process is exiting anyway.
This is a common case of SUOF (Static Uninitialization Order Fiasco), caused by giving up control over the object's lifetime by using a static local variable. The solution is to take back control over the object's lifetime by adding a pair of initialization / uninitialization routines (probably wrapped with RAII) that ensure the object is created before its first use and destroyed after its last use, but before the DLL is unloaded or main returns.
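One way such a pair of routines might look is sketched below. Everything here (the mylib namespace, Service, initialize, shutdown, Guard) is a hypothetical name chosen for illustration, and the sketch is single-threaded with respect to initialize/shutdown; a real library would also have to decide how nested or concurrent calls behave.

```cpp
#include <cassert>

namespace mylib {   // hypothetical library namespace

// Stand-in for the class that owns the background thread.
class Service
{
public:
    int doSomething() { return 0; }
};

namespace detail {
    // A plain pointer has no destructor, so nothing runs at static
    // destruction or DLL unload time; cleanup happens only in shutdown().
    Service* g_service = nullptr;
}

void initialize()
{
    if (detail::g_service == nullptr)
        detail::g_service = new Service;
}

void shutdown()
{
    delete detail::g_service;
    detail::g_service = nullptr;
}

bool isInitialized()
{
    return detail::g_service != nullptr;
}

Service& service()
{
    assert(isInitialized());   // invariant: used only between init and shutdown
    return *detail::g_service;
}

// RAII wrapper: the library is usable exactly as long as a Guard is alive.
class Guard
{
public:
    Guard()  { initialize(); }
    ~Guard() { shutdown(); }
    Guard(const Guard&) = delete;
    Guard& operator=(const Guard&) = delete;
};

}   // namespace mylib
```

The application would create one mylib::Guard near the top of main; everything that uses mylib::service() then runs inside that scope.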
I have done some searches on here, MSDN, and through some other forums via Google trying to find any sort of solution to this, but so far am stuck.
I have been looking for a week, trying to track down an access violation error in my C++ program. I can't really post code here as it is under some IP restrictions, but basically, it is a loop that runs roughly every 100 ms, reading bytes from a TCP connection and placing them onto the back of a std::queue.
After I notice a particular byte sequence come through, I then remove x bytes from the queue and handle them as a message defined in an internal protocol.
What happens is, somewhere inside my application, the queue is becoming corrupted and crashing the application. So pair that with the fact that it is an access violation, it must be a dodgy pointer somewhere.
I have tried to use the VS2005 debugger and WinDbg to find it. I had call stacks to look at, but they weren't much help. All I could work out from them is that the cause is corruption of my internal queue. The reason it crashes is that the header of the message gets sent to be parsed, but because it is corrupted, everything falls over.
Then I tried Intel Thread Checker but that is far too slow to use in this application, as my program is part of a synchronous multi-threaded system.
Sometimes it will run for 300 reads... sometimes it can do 5000 reads... sometimes it can do 10000 reads before it crashes.
What are some other routes of diagnosis I can try? Am I missing something simple here that I should have checked already? From what I can see, anything being newed has a matching delete, and I am using Boost libraries for shared pointers and auto pointers on long-living objects.
Use SEH (structured exception handling) to find out which part raises the AV.
Here is the SEH in C++ example code from MSDN (note that _set_se_translator requires compiling with /EHa):
#include <stdio.h>
#include <windows.h>
#include <eh.h>
void SEFunc();
void trans_func( unsigned int, EXCEPTION_POINTERS* );
class SE_Exception
{
private:
unsigned int nSE;
public:
SE_Exception() {}
SE_Exception( unsigned int n ) : nSE( n ) {}
~SE_Exception() {}
unsigned int getSeNumber() { return nSE; }
};
int main( void )
{
try
{
_set_se_translator( trans_func );
SEFunc();
}
catch( SE_Exception e )
{
printf( "Caught a __try exception with SE_Exception.\n" );
}
}
void SEFunc()
{
__try
{
int x, y=0;
x = 5 / y;
}
__finally
{
printf( "In finally\n" );
}
}
void trans_func( unsigned int u, EXCEPTION_POINTERS* pExp )
{
printf( "In trans_func.\n" );
throw SE_Exception();
}
Random crashes are usually caused by heap corruption, which is hard to find. Over the past few years I have dealt with several heap corruption problems; as I remember, one of them took me a whole weekend to track down. Here are some suggestions:
1. Try Application Verifier first. Details are at:
http://msdn.microsoft.com/en-us/library/windows/desktop/dd371695(v=vs.85).aspx
2. Gflags:
http://msdn.microsoft.com/en-us/library/windows/hardware/ff549557(v=vs.85).aspx
Use it to enable page heap verification.
3. Solutions 1 and 2 both use heap verification for your whole
program, so you may get many exceptions, some of them unrelated to
your problem, and your program will slow down. If you know which
part of the code has errors, you can use the CRT debug function
_CrtSetDbgFlag to enable heap verification just there, something like this:
// requires <crtdbg.h> and a debug build of the CRT
int tmpFlag = _CrtSetDbgFlag( _CRTDBG_REPORT_FLAG ); // read the current flags
tmpFlag |= _CRTDBG_CHECK_ALWAYS_DF;
_CrtSetDbgFlag( tmpFlag ); // verify the heap on every allocation and deallocation
// your code here; if the heap is corrupt, the debug CRT reports it at the next allocation
tmpFlag &= ~_CRTDBG_CHECK_ALWAYS_DF; // clear only this bit
_CrtSetDbgFlag( tmpFlag ); // stop verifying the heap on every call
I am debugging a multithreaded (pthread) C++ program on Linux.
It works well when the number of threads is small, such as 1, 2, or 3.
When the number of threads is increased, I get SIGSEGV (segmentation fault, UNIX signal 11).
However, the error appears only intermittently when I increase the number of threads above 4.
Running it under valgrind, I got:
==29655== Process terminating with default action of signal 11 (SIGSEGV)
==29655== Access not within mapped region at address 0xFFFFFFFFFFFFFFF8
==29655== at 0x3AEB69CA3E: std::string::assign(std::string const&) (in /usr/lib64/libstdc++.so.6.0.8)
==29655== by 0x42A93C: bufferType::getSenderID(std::string&) const (boundedBuffer.hpp:29)
It seems that my code tried to read memory that is not allocated.
But I cannot find any bug in the function getSenderID(). It only returns a string member of class bufferType, which has been initialized.
I used GDB and DDD (a GDB GUI) to find the bug; they also point there, but the error is intermittent, so I cannot catch it with a breakpoint in GDB.
Moreover, I also printed out the values in the function pointed to by valgrind, but it did not help, because multiple threads print their results in different orders and the output interleaves. Each time I run the code, the printed output is different.
The bufferType is in a map, and the map may have multiple entries. Each entry can be written by one thread and read by another thread at the same time. I have now used pthread read/write locks (pthread_rwlock_t). Now there is no SIGSEGV, but the program stops at some point without making progress. I think this is a deadlock. But one map entry can only be written by one thread at a time, so why is there still a deadlock?
Would you please recommend some methods to capture the bug so that I can find it no matter how many threads I use to run the code.
Thanks.
The code of boundedBuffer.hpp is as follows:
class bufferType
{
private:
string senderID;// who write the buffer
string recvID; // who should read the buffer
string arcID; // which arc is updated
double price; // write node's price
double arcValue; // this arc flow value
bool updateFlag ;
double arcCost;
int arcFlowUpBound;
//boost::mutex senderIDMutex;
//pthread_mutex_t senderIDMutex;
pthread_rwlock_t senderIDrwlock;
pthread_rwlock_t setUpdateFlaglock;
public:
//typedef boost::mutex::scoped_lock lock; // synchronous read / write
bufferType(){}
void getPrice(double& myPrice ) const {myPrice = price;}
void getArcValue(double& myArcValue ) const {myArcValue = arcValue;}
void setPrice(double& myPrice){price = myPrice;}
void setArcValue(double& myValue ){arcValue = myValue;}
void readBuffer(double& myPrice, double& myArcValue );
void writeBuffer(double& myPrice, double& myArcValue );
void getSenderID(string& myID)
{
//boost::mutex::scoped_lock lock(senderIDMutex);
//pthread_rwlock_rdlock(&senderIDrwlock);
cout << "senderID is " << senderID << endl ;
myID = senderID;
//pthread_rwlock_unlock(&senderIDrwlock);
}
//void setSenderID(string& myID){ senderID = myID ;}
void setSenderID(string& myID)
{
pthread_rwlock_wrlock(&senderIDrwlock);
senderID = myID ;
pthread_rwlock_unlock(&senderIDrwlock);
}
void getRecvID(string& myID) const {myID = recvID;}
void setRecvID(string& myID){ recvID = myID ;}
void getArcID(string& myID) const {myID = arcID ;}
void setArcID(string& myID){arcID = myID ;}
void getUpdateFlag(bool& myFlag)
{
myFlag = updateFlag ;
if (updateFlag)
updateFlag = false;
}
//void setUpdateFlag(bool myFlag){ updateFlag = myFlag ;}
void setUpdateFlag(bool myFlag)
{
pthread_rwlock_wrlock(&setUpdateFlaglock);
updateFlag = myFlag ;
pthread_rwlock_unlock(&setUpdateFlaglock);
}
void getArcCost(double& myc) const {myc = arcCost; }
void setArcCost(double& myc){ arcCost = myc ;}
void setArcFlowUpBound(int& myu){ arcFlowUpBound = myu ;}
int getArcFlowUpBound(){ return arcFlowUpBound ;}
//double getLastPrice() const {return price; }
} ;
From the code, you can see that I have tried to use read/write locks to preserve the invariants.
Each entry in the map has a buffer like the one above. Now I have a deadlock.
Access not within mapped region at address 0xFFFFFFFFFFFFFFF8
at 0x3AEB69CA3E: std::string::assign(std::string const&)
This would normally mean that you are assigning to a string* that was NULL, and then got decremented. Example:
#include <string>
int main()
{
std::string *s = NULL;
--s;
s->assign("abc");
}
g++ -g t.cc && valgrind -q ./a.out
...
==20980== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==20980== Access not within mapped region at address 0xFFFFFFFFFFFFFFF8
==20980== at 0x4EDCBE6: std::string::assign(char const*, unsigned long)
==20980== by 0x400659: main (/tmp/t.cc:8)
...
So show us the code in boundedBuffer.hpp (with line numbers), and think how that code could end up with a string pointer that points at -8.
Would you please recommend some methods to capture the bug so that I can find it no matter how many threads I use to run the code.
When thinking about multi-threaded programs, you must think about invariants. You should put assertions to confirm that your invariants do hold. You should think how they might be violated, and what violations would cause the post-mortem state you have observed.
Do you have any cases where an object (such as a string) is accessed in one thread while another thread is, or might be, modifying it? That's the usual cause of a problem like this.
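To make that concrete, here is a minimal sketch of a string member where both the reader and the writer take the same lock, plus a flag whose read-and-clear is one atomic step. The class names GuardedId and UpdateFlag are invented for this example, and it uses std::mutex and std::atomic rather than pthread_rwlock_t for brevity; with a raw pthread_rwlock_t you would additionally have to call pthread_rwlock_init before the first use and take the read lock in the getter.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <string>

// Hypothetical wrapper for a string shared between a writer and a reader.
class GuardedId
{
public:
    void set(const std::string& id)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        id_ = id;
    }

    // Returns a copy; the copy is made while the lock is held, so a
    // concurrent set() cannot modify the string halfway through.
    std::string get() const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return id_;
    }

private:
    mutable std::mutex mutex_;   // mutable so that get() can stay const
    std::string id_;
};

// Hypothetical "updated" flag: take() reads and clears in a single atomic
// operation, so two readers can never both observe the same update.
class UpdateFlag
{
public:
    UpdateFlag() : flag_(false) {}
    void raise() { flag_.store(true); }
    bool take()  { return flag_.exchange(false); }

private:
    std::atomic<bool> flag_;
};
```

The point is that every access path to the shared member goes through the same lock; locking only the setter protects nothing.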
Look at your instance of bufferType.
When was it instantiated?
If it was instantiated before threads were spawned, and then one of the threads modified it, you have a race condition without a lock.
Also, watch out for any static variables anywhere near or inside that bufferType.
From the looks of it, one of the threads has probably modified the member that is returned by getSenderID().
If none of these problems are causing your error, try using valgrind's drd.
My program is randomly crashing in a small scenario I can reproduce, but it happens in mlock.c (which is a VC++ runtime file) from ntdll.dll, and I can't see the stack trace. I do know that it happens in one of my thread functions, though.
This is the mlock.c code where the program crashes:
void __cdecl _unlock (
int locknum
)
{
/*
* leave the critical section.
*/
LeaveCriticalSection( _locktable[locknum].lock );
}
The error is "invalid handle specified". If I look at locknum, it's a number larger than _locktable's size, so this makes some sense.
This seems to be related to Critical Section usage. I do use CRITICAL_SECTIONS in my thread, via a CCriticalSection wrapper class and its associated RAII guard, CGuard. Definitions for both here to avoid even more clutter.
This is the thread function that's crashing:
unsigned int __stdcall CPlayBack::timerThread( void * pParams ) {
#ifdef _DEBUG
DRA::CommonCpp::SetThreadName( -1, "CPlayBack::timerThread" );
#endif
CPlayBack * pThis = static_cast<CPlayBack*>( pParams );
bool bContinue = true;
while( bContinue ) {
float m_fActualFrameRate = pThis->m_fFrameRate * pThis->m_fFrameRateMultiplier;
if( m_fActualFrameRate != 0 && pThis->m_bIsPlaying ) {
bContinue = ( ::WaitForSingleObject( pThis->m_hEndThreadEvent, static_cast<DWORD>( 1000.0f / m_fActualFrameRate ) ) == WAIT_TIMEOUT );
CImage img;
if( pThis->m_bIsPlaying && pThis->nextFrame( img ) )
pThis->sendImage( img );
}
else
bContinue = ( ::WaitForSingleObject( pThis->m_hEndThreadEvent, 10 ) == WAIT_TIMEOUT );
}
::GetErrorLoggerInstance()->Log( LOG_TYPE_NOTE, "CPlayBack", "timerThread", "Exiting thread" );
return 0;
}
Where does CCriticalSection come in? Every CImage object contains a CCriticalSection object which it uses through a CGuard RAII lock. Moreover, every CImage contains a CSharedMemory object which implements reference counting. To that end, it contains two CCriticalSection's as well, one for the data and one for the reference counter. A good example of these interactions is best seen in the destructors:
CImage::~CImage() {
CGuard guard(m_csData);
if( m_pSharedMemory != NULL ) {
m_pSharedMemory->decrementUse();
if( !m_pSharedMemory->isBeingUsed() ){
delete m_pSharedMemory;
m_pSharedMemory = NULL;
}
}
m_cProperties.ClearMin();
m_cProperties.ClearMax();
m_cProperties.ClearMode();
}
CSharedMemory::~CSharedMemory() {
CGuard guardUse( m_cs );
if( m_pData && m_bCanDelete ){
delete []m_pData;
}
m_use = 0;
m_pData = NULL;
}
Anyone bumped into this kind of error? Any suggestion?
Edit: I got to see some call stack: the call comes from ~CSharedMemory. So there must be some race condition there
Edit: More CSharedMemory code here
The "invalid handle specified" return code paints a pretty clear picture that your critical section object has been deallocated; assuming of course that it was allocated properly to begin with.
Your RAII class seems like a likely culprit. If you take a step back and think about it, your RAII class violates the Separation of Concerns principle, because it has two jobs:
It provides allocate/destroy semantics for the CRITICAL_SECTION
It provides acquire/release semantics for the CRITICAL_SECTION
Most implementations of a CS wrapper I have seen violate the SoC principle in the same way, but it can be problematic, especially when you have to start passing around instances of the class in order to get to the acquire/release functionality. Consider a simple, contrived example in pseudocode:
void WorkerThreadProc(CCriticalSection cs)
{
cs.Enter();
// MAGIC HAPPENS
cs.Leave();
}
int main()
{
CCriticalSection my_cs;
std::vector<NeatStuff> stuff_used_by_multiple_threads;
// Create 3 threads, passing the entry point "WorkerThreadProc"
for( int i = 0; i < 3; ++i )
CreateThread(... &WorkerThreadProc, my_cs);
// Join the 3 threads...
wait();
}
The problem here is CCriticalSection is passed by value, so the destructor is called 4 times. Each time the destructor is called, the CRITICAL_SECTION is deallocated. The first time works fine, but now it's gone.
You could kludge around this problem by passing references or pointers to the critical section class, but then you muddy the semantic waters with ownership issues. What if the thread that "owns" the crit sec dies before the other threads? You could use a shared_ptr, but now nobody really "owns" the critical section, and you have given up a little control in one area in order to gain a little in another area.
The true "fix" for this problem is to separate concerns. Have one class for allocation & deallocation:
// requires <windows.h>
class CCriticalSection : public CRITICAL_SECTION
{
public:
CCriticalSection(){ InitializeCriticalSection(this); }
~CCriticalSection() { DeleteCriticalSection(this); } // the Win32 call is DeleteCriticalSection
};
...and another to handle locking & unlocking...
class CSLock
{
public:
CSLock(CRITICAL_SECTION& cs) : cs_(cs) { EnterCriticalSection(&cs_); }
~CSLock() { LeaveCriticalSection(&cs_); }
private:
CRITICAL_SECTION& cs_;
};
Now you can pass around raw pointers or references to a single CCriticalSection object, possibly const, and have the worker threads instantiate their own CSLocks on it. The CSLock is owned by the thread that created it, which is as it should be, but ownership of the CCriticalSection is clearly retained by some controlling thread; also a good thing.
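For readers not on Windows, the same split can be sketched with a pthread mutex. This is a portable analogue, not the answer's Win32 code: OwnedMutex, ScopedLock and incrementShared are hypothetical names. The owner class is made non-copyable so that the pass-by-value mistake from the contrived example no longer compiles.

```cpp
#include <cerrno>
#include <pthread.h>
#include <type_traits>

// Concern 1: allocate/destroy. Exactly one object owns the mutex.
class OwnedMutex
{
public:
    OwnedMutex()  { pthread_mutex_init(&mutex_, nullptr); }
    ~OwnedMutex() { pthread_mutex_destroy(&mutex_); }

    // Copying would lead to the mutex being destroyed more than once.
    OwnedMutex(const OwnedMutex&) = delete;
    OwnedMutex& operator=(const OwnedMutex&) = delete;

    pthread_mutex_t* native() { return &mutex_; }

private:
    pthread_mutex_t mutex_;
};

// Concern 2: acquire/release. Each thread creates its own short-lived lock
// on a mutex that somebody else owns.
class ScopedLock
{
public:
    explicit ScopedLock(OwnedMutex& mutex) : mutex_(mutex)
    {
        pthread_mutex_lock(mutex_.native());
    }
    ~ScopedLock() { pthread_mutex_unlock(mutex_.native()); }

    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;

private:
    OwnedMutex& mutex_;
};

static_assert(!std::is_copy_constructible<OwnedMutex>::value,
              "the owner must not be copyable");

// What a worker thread would do: take the owner by reference, lock locally.
int incrementShared(OwnedMutex& mutex, int& counter)
{
    ScopedLock lock(mutex);
    return ++counter;
}
```

A function declared as taking OwnedMutex by value is now rejected by the compiler at the call site, instead of silently destroying the mutex when the copy goes out of scope.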
Make sure the critical section object is not inside a #pragma pack(1) region (or any non-default packing).
Ensure that no other thread (or the same thread) is corrupting the CS object. Run a static analysis tool to check for buffer overruns.
If you have a runtime analysis tool, run it to find the issue.
I decided to adhere to the KISS principle and simplify things. I figured I'd replace the CSharedMemory class with a std::tr1::shared_ptr<BYTE> and a CCriticalSection which protects it from concurrent access. Both are members of CImage now, and concerns are better separated, IMHO.
That solved the weird critical section problem, but now it seems I have a memory leak caused by std::tr1::shared_ptr. You might see me post about it soon... It never ends!
I'm attempting to run a part of my program in a thread and getting an unusual result.
I have updated this question with the results of the changes suggested by Remus, but as I am still getting an error, I feel the question is still open.
I have implemented functionality in a dll to tie into a piece of vendor software. Everything works until I attempt to create a thread inside this dll.
Here is the relevant section of the DLL:
extern "C" {
__declspec(dllexport) void __cdecl ccEntryOnEvent(WORD event);
}
to define the function the vendor's software calls, then:
using namespace std;
HANDLE LEETT_Thread = NULL;
static bool run_LEETT = true;
unsigned threadID;
void *lpParam;
int RunLEETTThread ( void ) {
LEETT_Thread = (HANDLE)_beginthreadex( NULL, 0, LEETT_Main, lpParam, 0 , &threadID );
//LEETT_Thread = CreateThread ( NULL, 0, LEETT_Main, lpParam, 0 , NULL );
if ( LEETT_Thread == NULL )
ErrorExit ( _T("Unable to start translator thread") );
run_LEETT = false; // We only wish to create the thread a single time.
return 0;
}
extern "C" void __cdecl ccEntryOnEvent(WORD event ) {
switch (event) {
case E_START:
if ( run_LEETT ) {
RunLEETTThread ();
MessageText ( "Running LEETT Thread" );
}
break;
}
WaitForSingleObject( LEETT_Thread ,INFINITE);
return;
}
The function is declared as
unsigned __stdcall LEETT_Main ( void* lpParam ) {
LEETT_Main is about 136k when compiled as a stand alone executable with no optimization (I have a separate file with a main() in it that calls the same function as myFunc).
Prior to changing the way the thread is called, the program would crash when declaring a structure containing a std::list, shown here:
struct stateFlags {
bool inComment; // multiline comments bypass parsing, but not line numbering
// Line preconditions
bool MCodeSeen; // only 1 m code per block allowed
bool GCodeSeen; // only 1 g code per block allowed
std::list <int> gotos; // a list of the destination line numbers
};
It now crashes on the _beginthreadex command, tracing through shows this
/*
* Allocate and initialize a per-thread data structure for the to-
* be-created thread.
*/
if ( (ptd = (_ptiddata)_calloc_crt(1, sizeof(struct _tiddata))) == NULL )
goto error_return;
Tracing through this I saw an error 252 (bad ptr) and ultimately 255 (runtime error).
I'm wondering if anyone has encountered this sort of behaviour when creating threads (in DLLs?) and what the remedy might be. When I created an instance of this structure in my toy program, there was no issue. When I removed the list variable, the program simply crashed elsewhere, on the declaration of a string.
I'm very open to suggestions at this point, if I have to I'll remove the idea of threading for now, though it's not particularly practical.
Thanks, especially to those reading this over again :)
Threads that use CRT (and std::list implies CRT) need to be created with _beginthreadex, as documented on MSDN:
A thread in an executable that calls the C run-time library (CRT)
should use the _beginthreadex and _endthreadex functions for thread
management rather than CreateThread and ExitThread;
It is not clear how you start your thread, but it appears that you're doing it in DllMain, which is not recommended (see Does creating a thread from DllMain deadlock or doesn't it?).
In rechecking the comments here and the configuration of the project, the vendor supplied solution file uses /MTd for debug, but we are building a DLL, so I needed to use /MDd, which immediately compiles and runs correctly.
Sorry about the ridiculous head scratcher...