How to investigate and fix libpjsua2.so crash - c++

SIGSEGV SEGV_MAPERR at 0x00000008
0 libpjsua2.so 0x56585a88 pj::Call::getInfo() const
1 libpjsua2.so 0x56546b44 std::allocator<pj::CallMediaInfo>::allocator()
I'm using pjsip for one of my hobby project(complies with GPL). Above you can see the stacktrace received from crashlytics. I'm using Java wrapper for pjsip.
There are a lot of users(50 %) affected by this error, however I'm not able to reproduce it on my local devices.
Not sure but I suspect that following java call lead to error. Which call C++ via JNI
public void notifyCallState(MyCall call) {
if (currentCall == null || call.getId() != currentCall.getId())
return;
CallInfo ci;
try {
ci = call.getInfo();
} catch (Exception e) {
ci = null;
}
Message m = Message.obtain(handler, MSG_TYPE.CALL_STATE, ci);
m.sendToTarget();
if (ci != null && ci.getState() == pjsip_inv_state.PJSIP_INV_STATE_DISCONNECTED) {
currentCall = null;
}
}
Code snippet is taken from examples which come from psjua download. Link to http repo. My code is the same. Any help highly appreciated

From the stacktrace is looks like call is null, and getId method is at 0x8 offset.
If that's really the case, the fix is to make sure notifyCallState isn't called with null argument, or to check it inside the method, i.e.:
if (call == null || currentCall == null || call.getId() != currentCall.getId())
return;

Your program is most likely hitting some sort of memory corruption and most likely heap memory. Following observations points towards that.
I'm not able to reproduce it on my local devices. This is common symptoms of memory corruption.
stack-trace includes std::allocator which indicates that program has been terminated while using(creating/deleting/accessing) the heap memory.
Recommendation
We should try to review the code logic and whether this program uses Interop service in correct way.I do not have much idea regarding this however it looks like your program logic does have JAVA/C++ interaction. If we are lucky we might get something obvious here and we are done.
If the stack-trace are after effect of something else, then we are in trouble we might have to take approach suggested in below posts.
Windows Platform
https://stackoverflow.com/a/22074401/2724703
Linux Platform
https://stackoverflow.com/a/22658693/2724703
Android Platform
https://stackoverflow.com/a/22663360/2724703
You may want to refer the above posts to get the idea about how to approach on such problems. As per my understanding, android platform does not have dynamic tools so you might have to use some versions(debug/additional logging) of your library.
I do hope that, above information might be useful and would have given some guidelines to approach your problem.

Related

Exception thrown: read access violation. std::shared_ptr<>::operator-><,0>(...)->**** was 0xFFFFFFFFFFFFFFE7

Good afternoon to all! I am writing a game engine using OpenGL + Win32 / GLFW. Therefore, I will say the project is large, but I have a problem that has led me to a dead end and I can't understand what the problem is. The problem is that I have a shared_ptr<Context> in the 'windows' class (it is platform-based) that is responsible for the context (GL, D3D). And everything works fine when I launch the application, everything is drawn normally, but when I start entering the cursor into the window, a crash occurs when any function calls from context, in my case is:
context->swapBuffers();
Here a crash:
std::shared_ptr<Context>::operator-><Context,0>(...)->**** was 0xFFFFFFFFFFFFFFE7.
Then I got into the callstack and saw that the context-> itself is non-zero, so the function should be called.
Then I searched for a long time for the problem and found that when I remove the Win32 message processing code, and when I remove it, errors do not occur when calling a function from context->. I removed everything unnecessary from the loop and left only the basic functions and tried to run it like this, because I thought that some other functions inside were causing this problem, but no.
while (PeekMessageW(&msg, NULL, NULL, NULL, PM_REMOVE) > 0) {
TranslateMessage(&msg);
DispatchMessageW(&msg);
}
That is, when I delete TranslateMessage() and DispatchMessage(), the error goes away. And then I finally got confused i.e I don't understand what is happening. And then I thought that maybe somehow the operating system itself affects that pointer, prohibits reading or something like that.
And then I paid attention to __vtptr in the call stack, and noticed that it is nullptr, and moreover it has the void** type. And also the strangest thing is that in the error I have ->**** was 0xffffffffffffffc7 as many as four consecutive pointers. What is it?
I understand I practically didn't throw off the code, because I have a big project and I think it doesn't make sense to throw everything, so I tried to explain the problem by roughly showing what is happening in my code. I will be grateful to everyone who will help :)

Can a process be limited on how much physical memory it uses?

I'm working on an application, which has the tendency to use excessive amounts of memory, so I'd like to reduce this.
I know this is possible for a Java program, by adding a Maximum heap size parameter during startup of the Java program (e.g. java.exe ... -Xmx4g), but here I'm dealing with an executable on a Windows-10 system, so this is not applicable.
The title of this post refers to this URL, which mentions a way to do this, but which also states:
Maximum Working Set. Indicates the maximum amount of working set assigned to the process. However, this number is ignored by Windows unless a hard limit has been configured for the process by a resource management application.
Meanwhile I can confirm that the following lines of code indeed don't have any impact on the memory usage of my program:
HANDLE h_jobObject = CreateJobObject(NULL, L"Jobobject");
if (!AssignProcessToJobObject(h_jobObject, OpenProcess(PROCESS_ALL_ACCESS, FALSE, GetCurrentProcessId())))
{
throw "COULD NOT ASSIGN SELF TO JOB OBJECT!:";
}
JOBOBJECT_EXTENDED_LIMIT_INFORMATION tagJobBase = { 0 };
tagJobBase.BasicLimitInformation.MaximumWorkingSetSize = 1; // far too small, just to see what happens
BOOL bSuc = SetInformationJobObject(h_jobObject, JobObjectExtendedLimitInformation, (LPVOID)&tagJobBase, sizeof(tagJobBase));
=> bSuc is true, or is there anything else I should expect?
In top of this, the mentioned tools (resource managed applications, like Hyper-V) seem not to work on my Windows-10 system.
Next to this, there seems to be another post about this subject "Is there any way to force the WorkingSet of a process to be 1GB in C++?", but here the results seem to be negative too.
For a good understanding: I'm working in C++, so the solution, proposed in this URL are not applicable.
So now I'm stuck with the simple question: is there a way, implementable in C++, to limit the memory usage of the current process, running on Windows-10?
Does anybody have an idea?
Thanks in advance

Boomerang decompiler fails

I need to decompile a windows program which the source code was lost for a long time.
I am using boomerang in Windows 7 for this. However, it looks broken, gives this message and quits:
Could not open dynamic loader library Win32BinaryFile.dll (error #998)
Googling about it gives no useful results. Looking in the boomerang source code, it is apparently coming from this:
00137 hModule = LoadLibraryA(libName.c_str());
00138 if(hModule == NULL) {
00139 int err = GetLastError();
00140 fprintf( stderr, "Could not open dynamic loader library %s (error #%d)\n", libName.c_str(), err);
00141 fclose(f);
00142 return NULL;
00143 }
I.e. LoadLibraryA is failing with the status 998.
What could I do to fix that?
Edit, four hours later:
The program that I want to decompile is a work that me and a friend implemented in 2005. The source just gone in the mean time without we seeing that. Now, in 2013, when we searched it, nothing was found. In retrospect, it was probably lost in 2008 or in 2010, two occasions where my computer hardware crashed and I needed to get a new computer (and lost a lot of data with that). We had several backups scattered in several places, but after an exhaustive search, I found nothing.
I know that since boomerang is open source, I could just get its source code and hack it around. However, that sort of task is not what I originally intended to do, since the focus is just to decompile my program and I guess that I am missing something simple, since it can't load the DLL while it is clearly there.
I don't need the exact code back, just a sketch of what were the exact details of the algorithm that were implemented. Having that, I can rewrite the rest again.

NtQuerySystemInformation Hook Failure

After successfully building a trampoline and learning more about process memory space, I tested the trampoline on MessageBoxA. It worked perfectly so I decided to finally put the code to the use it was supposed to be for in the first place, hiding a process by hooking NtQuerySystemInformation. The redirect function should work fine, but the code I used to write the jmp instruction now causes the task manager to crash every time.
BYTE tmpJMP[5] = {0xE9,0x00,0x00,0x00,0x00}; //jmp,A,D,D,R
memcpy(JMP,tmpJMP,5);
DWORD Addr = ((DWORD)func - ((DWORD)oNtQuerySystemInformation + 0x5));
for (int i=0;i<4;++i)
JMP[i+1] = ((BYTE*)&Addr)[i];
if (VirtualProtect((LPVOID)oNtQuerySystemInformation,5,PAGE_EXECUTE_READWRITE,&oldProtect) == FALSE)
MessageBox(NULL,L"Error unprotecting memory",L"",MB_OK);
memcpy(oldBytes,(LPVOID)oNtQuerySystemInformation,5);
if (!WriteProcessMemory(GetCurrentProcess(),(LPVOID)oNtQuerySystemInformation,(LPCVOID)JMP,5,NULL))
MessageBox(NULL,L"Unable to write to process memory space",L"",MB_OK);
VirtualProtect((LPVOID)oNtQuerySystemInformation,5,oldProtect,NULL);
FlushInstructionCache(GetCurrentProcess(),NULL,NULL);
I'm writing to the memory as such. I can't seem to find a problem with the code. I was thinking that maybe the memory changes from API to API but I was told that's incorrect, making me all out confused. Is there anything wrong that you all see? Please be descriptive. I love to learn o3o

Make main() "uncrashable"

I want to program a daemon-manager that takes care that all daemons are running, like so (simplified pseudocode):
void watchMe(filename)
{
while (true)
{
system(filename); //freezes as long as filename runs
//oh, filename must be crashed. Nevermind, will be restarted
}
}
int main()
{
_beginThread(watchMe, "foo.exe");
_beginThread(watchMe, "bar.exe");
}
This part is already working - but now I am facing the problem that when an observed application - say foo.exe - crashes, the corresponding system-call freezes until I confirm this beautiful message box:
This makes the daemon useless.
What I think might be a solution is to make the main() of the observed programs (which I control) "uncrashable" so they are shutting down gracefully without showing this ugly message box.
Like so:
try
{
char *p = NULL;
*p = 123; //nice null pointer exception
}
catch (...)
{
cout << "Caught Exception. Terminating gracefully" << endl;
return 0;
}
But this doesn't work as it still produces this error message:
("Untreated exception ... Write access violation ...")
I've tried SetUnhandledExceptionFilter and all other stuff, but without effect.
Any help would be highly appreciated.
Greets
This seems more like a SEH exception than a C++ exception, and needs to be handled differently, try the following code:
__try
{
char *p = NULL;
*p = 123; //nice null pointer exception
}
__except(GetExceptionCode() == EXCEPTION_ACCESS_VIOLATION ?
EXCEPTION_EXECUTE_HANDLER : EXCEPTION_CONTINUE_SEARCH)
{
cout << "Caught Exception. Terminating gracefully" << endl;
return 0;
}
But thats a remedy and not a cure, you might have better luck running the processes within a sandbox.
You can change the /EHsc to /EHa flag in your compiler command line (Properties/ C/C++ / Code Generation/ Enable C++ exceptions).
See this for a similar question on SO.
You can run the watched process a-synchronously, and use kernel objects to communicate with it. For instance, you can:
Create a named event.
Start the target process.
Wait on the created event
In the target process, when the crash is encountered, open the named event, and set it.
This way, your monitor will continue to run as soon as the crash is encountered in the watched process, even if the watched process has not ended yet.
BTW, you might be able to control the appearance of the first error message using drwtsn32 (or whatever is used in Win7), and I'm not sure, but the second error message might only appear in debug builds. Building in release mode might make it easier for you, though the most important thing, IMHO, is solving the cause of the crashes in the first place - which will be easier in debug builds.
I did this a long time ago (in the 90s, on NT4). I don't expect the principles to have changed.
The basic approach is once you have started the process to inject a DLL that duplicates the functionality of UnhandledExceptionFilter() from KERNEL32.DLL. Rummaging around my old code, I see that I patched GetProcAddress, LoadLibraryA, LoadLibraryW, LoadLibraryExA, LoadLibraryExW and UnhandledExceptionFilter.
The hooking of the LoadLibrary* functions dealt with making sure the patching was present for all modules. The revised GetProcAddress had provide addresses of the patched versions of the functions rather than the KERNEL32.DLL versions.
And, of course, the UnhandledExceptionFilter() replacement does what you want. For example, start a just in time debugger to take a process dump (core dumps are implemented in user mode on NT and successors) and then kill the process.
My implementation had the patched functions implemented with __declspec(naked), and dealt with all the registered by hand because the compiler can destroy the contents of some registers that callers from assembly might not expect to be destroyed.
Of course there was a bunch more detail, but that is the essential outline.