Protecting class memory array to detect segmentation fault - c++

I am on Linux (CentOS 7.4, compiler Clang) and getting a segmentation fault (not easily reproducible) within a C++ struct object. The struct is a class member of a polymorphic object that I do not allocate; it is instantiated within a framework I do not have the source code to. This means I cannot easily compile with the sanitizers, and Valgrind increases the initialisation time from seconds to 5 minutes:
// C is allocated within a third party framework, I assume they use new()
//
class C : public ThirdPartyParentClass
{
S s;
};
struct S
{
.
std::mutex _mutex;
.
};
The segmentation fault shows up as a corrupted _mutex.
I therefore added a char buffer so I could see the corruption:
struct S
{
.
char _buffer[1000];
std::mutex _mutex;
.
};
and I can see the corrupted bytes when the segmentation fault occurs. However, I cannot determine when the corruption takes place.
To determine when the corruption takes place I would like to protect the char buffer bytes. I tried:
struct S
{
S()
{
mprotect(&_buffer[0], 4096, PROT_NONE);
const int test = _buffer[0]; // Trigger seg fault to test it works
}
.
char _buffer[4096] __attribute__((aligned(4096)));
std::mutex _mutex;
.
};
but my test read, which should confirm that the memory protection is working, doesn't cause a seg fault.
Could somebody please help?

Doing this at the source level is a bit silly. If you want to find the exact moment when something gets written to a particular memory address, use a data breakpoint (GDB calls them "watchpoints"). Just do watch *(int*)0xWHATEVER on the area of memory you expect to be corrupted, and it'll break on the first modification. When the debugger can use a hardware watchpoint, the overhead is very low.
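If attaching a debugger is not practical, the mprotect approach from the question can still be checked. The sketch below is an illustration under assumptions, not a diagnosis: it assumes Linux, and protect_region and alloc_page are hypothetical helpers that do not appear in the question. mprotect fails with EINVAL when the address is not page-aligned, which could happen if the framework's allocator does not honour the 4096-byte alignment requested for _buffer. Since the question's code ignores the return value, such a failure would be one possible reason the test read does not fault.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: change protection on [addr, addr + len) and report
// any failure. Returns 0 on success, otherwise the errno value.
int protect_region(void* addr, std::size_t len, int prot) {
    if (mprotect(addr, len, prot) != 0) {
        const int err = errno;
        std::fprintf(stderr, "mprotect(%p, %zu) failed: %s\n",
                     addr, len, std::strerror(err));
        return err;
    }
    return 0;
}

// Hypothetical helper: obtain one page-sized, page-aligned region directly
// from the kernel, as a known-good target for protect_region.
// Returns nullptr on failure.
void* alloc_page() {
    const long page = sysconf(_SC_PAGESIZE);
    void* p = mmap(nullptr, static_cast<std::size_t>(page),
                   PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}
```

Calling protect_region(&_buffer[0], 4096, PROT_NONE) in the constructor of S and inspecting the result would show whether the in-object buffer is really page-aligned at run time.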

Related

Insert into C++ map leads to infinite recursion?

I'm trying to write some code that will allow me to insert into a C++ map from an extern "C" function. The code is as follows:
class CFITracing {
std::unordered_map<uintptr_t, uintptr_t> CallerCalleePairs;
std::map<std::string, int> BranchResults;
public:
void HandleCallerCallee(uintptr_t Caller, uintptr_t Callee);
void HandleBranchResult(int cond, char* branchName);
void printResults();
};
...
void CFITracing::HandleBranchResult(int cond, char* branchName) {
std::string branchStr(branchName);
printf("%s\n", "pre");
/* segfault at this line, regardless of what string I use as a key (even "hi") */
BranchResults[branchStr] = cond;
printf("%s\n", "success");
}
CFITracing CFIT;
__attribute__((used))
__attribute__((optnone))
extern "C" void __trace(int cond, char* branchName) {
CFIT.HandleBranchResult(cond, branchName);
}
Calls to the __trace function are inserted into the binary via an LLVM pass I've written, which passes an int and char* to my code above.
On calls to __trace, "pre" is printed repeatedly until a segmentation fault occurs. GDB shows that, when the line with the map insert occurs, the code somehow loops and this line is called repeatedly until the segfault occurs.
When debugging via valgrind, the following error occurs:
==20881== Stack overflow in thread #1: can't grow stack to 0x1ffe801000
==20881==
==20881== Process terminating with default action of signal 11 (SIGSEGV)
==20881== Access not within mapped region at address 0x1FFE801FF8
==20881== Stack overflow in thread #1: can't grow stack to 0x1ffe801000
==20881== at 0x58069A2: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1220)
The presence of a stack overflow, together with the repeated prints, makes me think infinite recursion has been triggered. I think it's likely I've created some type of undefined behavior, but I'm not sure exactly how to write this code to prevent it, given that I need to insert into the map via this extern "C" __trace() function.
Thus, my questions are as follows: is there anything quick I can fix to prevent this behavior from taking place? If not, how should I aim to redesign this, given that I need to insert into the map via the extern "C" function? Thanks for your help!
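No confirmed diagnosis is given for this question. One possibility, stated here as an assumption, is that the LLVM pass also instruments the code reached from __trace (for example the std::map insertion), so that each insertion calls __trace again. If that is the case, a reentrancy guard is a common way to break the cycle. In this sketch trace_guarded, TraceScope, g_results, and g_in_trace are hypothetical names, not part of the original code.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for CFITracing::BranchResults.
static std::map<std::string, int> g_results;

// True while a trace call is in progress on the current thread.
static thread_local bool g_in_trace = false;

// Sets the flag for the lifetime of the object, so it is cleared even if
// the map insertion throws.
struct TraceScope {
    TraceScope() { g_in_trace = true; }
    ~TraceScope() { g_in_trace = false; }
};

// Records the branch result unless a trace call is already active on this
// thread. Returns true if recorded, false if the nested call was skipped.
bool trace_guarded(int cond, const char* branchName) {
    if (g_in_trace) {
        return false;  // nested call: do nothing, so the recursion cannot grow
    }
    TraceScope scope;
    g_results[std::string(branchName)] = cond;
    return true;
}
```

If the pass also instruments __trace itself, the guard alone may not be enough, and the pass would need to skip the tracing functions when inserting calls.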

Changing the order of member definition corrupts memory

I've got a class with some member objects. Everything works just fine until I change the order in which the members are defined.
This is the code which works:
class Core{
public:
/// Initialization and main loop.
bool initCore(int argc, char *argv[], string dirPath);
int mainLoop();
/// Network impulse message
static void networkImpulseCallback(unsigned char* data, int length);
private:
bool _running;
string _directoryPath;
string _serverIP;
string _loginName;
string _pass;
string _myShowName;
/// System components.
ConfigurationFile _config;
ResourceParser _resourceParser;
Graphics _graphics;
Input _input;
ScriptInterpreter _scriptInt;
GUISystem _guiSystem;
EntityManager _entityManager;
/// Component threads.
thread *_netLoop;
thread *_graphLoop;
thread *_scriptLoop;
thread *_animationLoop;
thread *_physicsLoop;
/*
* Initializes the rest of the system after the login.
*/
bool _initPostLogin();
};
If I now put, for example, the Graphics object under the Input object, I get an access violation. Where the violation happens depends on which object I move. I tried to figure out which object is causing the error by moving the objects around, but unfortunately got no result. The members of the object in which the violation happens are all uninitialized (for example a vector or a mutex).
Now, my guess is that somewhere the memory gets corrupted. If so, what is the best way to locate the bug? If not, where could the problem be?
Do you have some buffer inside the Input class? I'd guess you write past its end and corrupt the next member (Graphics).
Did you allocate memory for your pointers to point at?
Is a library deallocating objects before you access them?
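The first comment's hypothesis (a write past the end of a buffer in one member landing in the next member) can be tested by routing writes through a bounds-checked accessor. This is a minimal sketch, assuming a fixed-size buffer; InputLike, keys, and set_key are hypothetical names, since the real Input class is not shown.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for a component that owns a fixed-size buffer.
struct InputLike {
    std::array<unsigned char, 8> keys{};  // zero-initialised

    // Writes through at(), which throws std::out_of_range instead of
    // silently overwriting whatever follows the buffer in memory.
    // Returns true on success, false if index was out of range.
    bool set_key(std::size_t index, unsigned char value) {
        try {
            keys.at(index) = value;
            return true;
        } catch (const std::out_of_range&) {
            return false;
        }
    }
};
```

In a real investigation the catch block would more usefully log the index or break into the debugger, so the offending caller can be identified.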

ExtAudioFileOpenURL leak

I am opening an audio file to read it and I get an abandoned malloc block from this caller each time.
In a loop I allocate data like this (Instruments attributes 99.7% of the memory usage to this line): data = (short*)malloc(kSegmentSize*sizeof(short));
and free it with free(data); at the end of each iteration.
I'm not really sure what is happening here and would appreciate any help.
EDIT: kSegmentSize varies in the thousands, from a minimum of 6000 to a maximum of 50000 (speculative)
Instruments trace:
Not having the exact code:
Pretty sure you're having this problem because something between the malloc and the free is throwing (and you're probably catching it already, so you don't exit the loop). Depending on whether this is happening in C (or Objective-C) or C++ code, you have slightly different methods of resolution.
In C++, wrap the malloc/free in the RAII pattern so that when the stack is unwound the free is called.
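#include <cstdlib>
class MyData {
public:
explicit MyData(size_t numShorts) : dataPtr(static_cast<short*>(malloc(numShorts * sizeof(short)))) {}
~MyData() { free(dataPtr); } // also runs during stack unwinding
operator short*() { return dataPtr; }
private:
MyData(const MyData&); // not copyable: a copy would free the block twice
MyData& operator=(const MyData&);
short* dataPtr;
};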
MyData data(numShorts);
// do your stuff, you can still use data as you were before due the 'operator short*'
// allow the dtor to be called when you go out of scope
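An alternative to a hand-written wrapper is std::unique_ptr with a deleter that calls std::free (C++11 or later). The sketch below is illustrative only; FreeDeleter, SegmentPtr, and make_segment are hypothetical names, and it uses calloc rather than malloc so that the contents start out zeroed.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>

// Deleter that releases malloc/calloc memory with std::free.
struct FreeDeleter {
    void operator()(void* p) const noexcept { std::free(p); }
};

// Owning pointer to an array of shorts; frees the block when it goes out
// of scope, including when an exception unwinds the stack.
using SegmentPtr = std::unique_ptr<short[], FreeDeleter>;

// Hypothetical helper: allocates numShorts zeroed shorts.
// The returned pointer is empty if the allocation fails.
SegmentPtr make_segment(std::size_t numShorts) {
    return SegmentPtr(
        static_cast<short*>(std::calloc(numShorts, sizeof(short))));
}
```

Inside the loop, SegmentPtr data = make_segment(kSegmentSize); would replace the malloc/free pair, and data.get() yields the raw short* where an API needs one.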
In Objective-C you need to use a finally block:
void* myPtr = 0;
@try { myPtr = malloc(...); /* work that may throw */ }
@catch (NSException *e) { /* handle or log */ }
@finally { free(myPtr); }
Suggest that you start by simplifying, for example comment out (preferably using #if 0) all of the code except the malloc/free. Run the code and ensure no abandoned heap blocks. Then gradually re-introduce the remaining code and re-run until you hit the problem, then debug.
Sorry to answer my own question, but after commenting out code back up the stack trace, the actual issue turned out to be that the file was not being disposed.
Calling ExtAudioFileDispose(audioFile); solved this hidden bug. Instruments was not entirely clear and marked the mallocs as the leak. To be fair, the mallocs were from data within the file referenced by the ExtAudioFileOpenURL call; not disposing the file reference left a leak.

What are some useful tips to track down a randomly occurring C0000005 Access Violation in C++?

I have done some searches on here, MSDN, and through some other forums via Google trying to find any sort of solution to this, but so far am stuck.
I have been looking for a week, trying to track down an access violation error in my C++ program. I can't really post code here as it is under some IP restrictions, but basically, it is a loop that runs roughly every 100ms, reading bytes from a TCP connection and placing them onto the back of a std::queue.
After I notice a particular byte sequence come through, I then remove x bytes from the queue and handle them as a message defined in an internal protocol.
What happens is, somewhere inside my application, the queue is becoming corrupted and crashing the application. So pair that with the fact that it is an access violation, it must be a dodgy pointer somewhere.
I have tried to use the VS2005 debugger and WinDbg to find it. I had call stacks to look at, but they weren't much help. All I could work out from them is that the cause is corruption of my internal queue. The reason it crashes is that the header of the message gets sent to be parsed, but because it is corrupted everything falls over.
Then I tried Intel Thread Checker but that is far too slow to use in this application, as my program is part of a synchronous multi-threaded system.
Sometimes it will run for 300 reads... sometimes it can do 5000 reads... sometimes it can do 10000 reads before it crashes.
What are some other routes of diagnosis I can try? Am I missing something simple here that I should have checked already? From what I can see, anything being newed has a matching delete, and I am using the Boost libraries for shared pointers and auto pointers on long-living objects.
Use SEH (structured exception handling) to find out which part raises the access violation.
SEH in C++ example code from MSDN.
#include <stdio.h>
#include <windows.h>
#include <eh.h>
void SEFunc();
void trans_func( unsigned int, EXCEPTION_POINTERS* );
class SE_Exception
{
private:
unsigned int nSE;
public:
SE_Exception() {}
SE_Exception( unsigned int n ) : nSE( n ) {}
~SE_Exception() {}
unsigned int getSeNumber() { return nSE; }
};
int main( void )
{
try
{
_set_se_translator( trans_func );
SEFunc();
}
catch( SE_Exception e )
{
printf( "Caught a __try exception with SE_Exception.\n" );
}
}
void SEFunc()
{
__try
{
int x, y=0;
x = 5 / y;
}
__finally
{
printf( "In finally\n" );
}
}
void trans_func( unsigned int u, EXCEPTION_POINTERS* pExp )
{
printf( "In trans_func.\n" );
throw SE_Exception();
}
Random crashes are usually caused by heap corruption, which is hard to find. Over the past years I have dealt with several heap corruption problems; as I remember, one of them took me a whole weekend to track down. Here are some suggestions:
1. Try Application Verifier first. Details are at:
http://msdn.microsoft.com/en-us/library/windows/desktop/dd371695(v=vs.85).aspx
2. Gflags:
http://msdn.microsoft.com/en-us/library/windows/hardware/ff549557(v=vs.85).aspx
Use it to enable page heap verification.
3. Solutions 1 and 2 both apply heap verification to your whole program, so the program slows down and you may get many exceptions, some of them unrelated to your problem. If you know which part of the code has errors, you can use the CRT debug function _CrtSetDbgFlag (debug builds only) to enable heap verification around it, something like this:
int tmpFlag = _CrtSetDbgFlag( _CRTDBG_REPORT_FLAG );
tmpFlag |= _CRTDBG_CHECK_ALWAYS_DF;
_CrtSetDbgFlag(tmpFlag); // verify the heap on every allocation and deallocation
// your code here; if the heap is corrupt, it is reported at the next allocation or deallocation
tmpFlag &= ~_CRTDBG_CHECK_ALWAYS_DF;
_CrtSetDbgFlag(tmpFlag); // stop verifying the heap on every call

How to find a (segmentation fault) bug in C++ (pthread) multithread program on linux?

I am debugging a (pthread) multithreaded C++ program on Linux.
It works well when the number of threads is small, such as 1, 2 or 3.
When the number of threads is increased, I get SIGSEGV (segmentation fault, UNIX signal 11).
However, the error is intermittent: with more than 4 threads it sometimes appears and sometimes does not.
I ran it under valgrind and got:
==29655== Process terminating with default action of signal 11 (SIGSEGV)
==29655== Access not within mapped region at address 0xFFFFFFFFFFFFFFF8
==29655== at 0x3AEB69CA3E: std::string::assign(std::string const&) (in /usr/lib64/libstdc++.so.6.0.8)
==29655== by 0x42A93C: bufferType::getSenderID(std::string&) const (boundedBuffer.hpp:29)
It seems that my code tried to read memory that is not allocated.
But I cannot find any bug in the function getSenderID(). It only returns a string that is a data member of class bufferType, and that member has been initialized.
I used GDB and DDD (a GDB GUI) to find the bug. They also point there, but the error sometimes disappears, so I cannot capture it with a breakpoint in GDB.
Moreover, I also printed out values from the function that valgrind points to, but that is not helpful because multiple threads print their results in different orders and the output interleaves. Each time I run the code, the printed output is different.
The bufferType is in a map, and the map may have multiple entries. Each entry can be written by one thread and read by another thread at the same time. I have used a pthread read/write lock (pthread_rwlock_t) to protect it. Now there is no SIGSEGV, but the program stops at some point without making progress. I think this is a deadlock. But one map entry can only be written by one thread at a time, so why is there still a deadlock?
Would you please recommend some methods to capture the bug, so that I can find it no matter how many threads I use to run the code?
Thanks.
The code of boundedBuffer.hpp is as follows:
class bufferType
{
private:
string senderID;// who write the buffer
string recvID; // who should read the buffer
string arcID; // which arc is updated
double price; // write node's price
double arcValue; // this arc flow value
bool updateFlag ;
double arcCost;
int arcFlowUpBound;
//boost::mutex senderIDMutex;
//pthread_mutex_t senderIDMutex;
pthread_rwlock_t senderIDrwlock;
pthread_rwlock_t setUpdateFlaglock;
public:
//typedef boost::mutex::scoped_lock lock; // synchronous read / write
bufferType(){}
void getPrice(double& myPrice ) const {myPrice = price;}
void getArcValue(double& myArcValue ) const {myArcValue = arcValue;}
void setPrice(double& myPrice){price = myPrice;}
void setArcValue(double& myValue ){arcValue = myValue;}
void readBuffer(double& myPrice, double& myArcValue );
void writeBuffer(double& myPrice, double& myArcValue );
void getSenderID(string& myID)
{
//boost::mutex::scoped_lock lock(senderIDMutex);
//pthread_rwlock_rdlock(&senderIDrwlock);
cout << "senderID is " << senderID << endl ;
myID = senderID;
//pthread_rwlock_unlock(&senderIDrwlock);
}
//void setSenderID(string& myID){ senderID = myID ;}
void setSenderID(string& myID)
{
pthread_rwlock_wrlock(&senderIDrwlock);
senderID = myID ;
pthread_rwlock_unlock(&senderIDrwlock);
}
void getRecvID(string& myID) const {myID = recvID;}
void setRecvID(string& myID){ recvID = myID ;}
void getArcID(string& myID) const {myID = arcID ;}
void setArcID(string& myID){arcID = myID ;}
void getUpdateFlag(bool& myFlag)
{
myFlag = updateFlag ;
if (updateFlag)
updateFlag = false;
}
//void setUpdateFlag(bool myFlag){ updateFlag = myFlag ;}
void setUpdateFlag(bool myFlag)
{
pthread_rwlock_wrlock(&setUpdateFlaglock);
updateFlag = myFlag ;
pthread_rwlock_unlock(&setUpdateFlaglock);
}
void getArcCost(double& myc) const {myc = arcCost; }
void setArcCost(double& myc){ arcCost = myc ;}
void setArcFlowUpBound(int& myu){ arcFlowUpBound = myu ;}
int getArcFlowUpBound(){ return arcFlowUpBound ;}
//double getLastPrice() const {return price; }
} ;
From the code, you can see that I have tried to use read/write lock to assure invariant.
Each entry in map has a buffer like this above. Now, I have got deadlock.
Access not within mapped region at address 0xFFFFFFFFFFFFFFF8
at 0x3AEB69CA3E: std::string::assign(std::string const&)
This would normally mean that you are assigning to a string* that was NULL, and then got decremented. Example:
#include <string>
int main()
{
std::string *s = NULL;
--s;
s->assign("abc");
}
g++ -g t.cc && valgrind -q ./a.out
...
==20980== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==20980== Access not within mapped region at address 0xFFFFFFFFFFFFFFF8
==20980== at 0x4EDCBE6: std::string::assign(char const*, unsigned long)
==20980== by 0x400659: main (/tmp/t.cc:8)
...
So show us the code in boundedBuffer.hpp (with line numbers), and think how that code could end up with a string pointer that points at -8.
Would you please recommend some methods to capture the bug so that I can find it no matter how many threads I use to run the code.
When thinking about multi-threaded programs, you must think about invariants. You should put assertions to confirm that your invariants do hold. You should think how they might be violated, and what violations would cause the post-mortem state you have observed.
Do you have any cases where an object (such as a string) is accessed in one thread while another thread is, or might be, modifying it? That's the usual cause of a problem like this.
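A minimal sketch of guarding both the reader and the writer of a shared string with the same lock. It uses std::mutex (C++11) rather than the pthread_rwlock_t from the question, which is a substitution made for brevity, and SharedId is a hypothetical name. For comparison, in the question's code getSenderID reads senderID with its lock calls commented out, and the constructor shown does not call pthread_rwlock_init.

```cpp
#include <mutex>
#include <string>

// Hypothetical reduced version of bufferType that holds only the sender ID.
class SharedId {
public:
    // Copies the ID out while holding the lock, so a concurrent set()
    // cannot modify the string in the middle of the copy.
    std::string get() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return id_;
    }

    // Replaces the ID while holding the same lock as get().
    void set(const std::string& id) {
        std::lock_guard<std::mutex> lock(mutex_);
        id_ = id;
    }

private:
    mutable std::mutex mutex_;  // mutable so that get() can stay const
    std::string id_;
};
```

Returning the string by value rather than by reference keeps the caller from touching the shared member after the lock has been released.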
Look at your instance of bufferType.
When was it instantiated?
If it was instantiated before threads were spawned, and then one of the threads modified it, you have a race condition without a lock.
Also, watch out for any static variables anywhere near or inside that bufferType.
From the looks of it, one of the threads has probably modified the member that is returned by getSenderID().
If none of these problems are causing your error, try using valgrind's drd.