GDB internal error when using clone() - gdb

I'm working on Ubuntu 12.04 and debugging the code below with gdb. I get the following internal error whenever I step over the call to clone():
34 pid = clone(entryPt,(char*)pStack+stackSize,SIGCHLD|CLONE_FS|CLONE_FILES|CLONE_VM,&arg);
(gdb) n
/build/buildd/gdb-7.4-2012.04/gdb/linux-thread-db.c:418: internal-error: thread_get_info_callback: Assertion `inout->thread_info != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)
Here is the code segment.
int taskSpawn(char* name, int priority, int option, int stackSize, FUNCPTR entryPt, int arg)
{
    TblEntry *Task_Entry;
    int Handle, q_idx, old_idx, old_priority;
    void *pStack;
    pid_t pid = 0; // to be modified.
    struct _Task *new_TCB, *old_TCB;
    /* Get a free entry */
    Handle = _CreateHandle(&Task_Entry);
    if( Handle == -1 )
        return -1; // fail to spawn task.
    pStack = malloc(stackSize);
    // Call clone().
#ifdef __linux
    pid = clone(entryPt,
                (char*)pStack+stackSize,
                SIGCHLD|CLONE_FS|CLONE_FILES|CLONE_VM,
                &arg);
    kill(pid, SIGSTOP);
    printf("clone returns %d\n", pid);
    if(pid < 0)
    {
        exit(0);
    }
#endif
    return 0;
}
Why does clone() give me this error?
Thank you in advance.

It's not your fault. It's a bug in GDB: https://sourceware.org/bugzilla/show_bug.cgi?id=18006

wait() hangs when CLONE_THREAD

I am tracing some processes and their children using ptrace. I am trying to print specific system calls (using a seccomp filter that notifies ptrace, see this blogpost).
In most cases my code (see below) works fine. However, when I trace a Java program (from the default-jre package), it clones using the CLONE_THREAD flag, and for some reason my tracer hangs, I believe because I can't receive signals from the cloned process. I think the reason is that (according to this discussion) the child process in fact becomes a child of the original process's parent, instead of becoming the original process's child.
I reproduced this issue with a simple program that calls clone() with flags and performs some actions. When I use the CLONE_THREAD | CLONE_SIGHAND | CLONE_VM flags (the clone() documentation specifies that they must come together since Linux 2.6.0), I am at least able to trace everything correctly until one of the two threads finishes.
I would like to trace both threads independently. Is it possible?
More importantly, I need to trace a Java program, and I cannot change it. Here is an strace of the Java program's clone call:
[...]
4665 clone(child_stack=0x7fb166e95fb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tid=[4666], tls=0x7fb166e96700, child_tidptr=0x7fb166e969d0) = 4666
[...]
So Java seems to respect the rules. I wanted to experiment to understand, so I ruled out any flags unrelated to threads (i.e., CLONE_FS | CLONE_FILES | CLONE_SYSVSEM).
Here are the results of running my test program with different combinations of flags (I know, I am really desperate):
CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS: only gets trace from parent
CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_PARENT_SETTID: inconsistent; gets trace from both until the parent finishes
CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_CHILD_CLEARTID: inconsistent; gets trace from both until the child finishes
CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS|CLONE_PARENT_SETTID: only gets trace from parent
CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS|CLONE_CHILD_CLEARTID: only gets trace from parent
CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_PARENT_SETTID|CLONE_SETTLS: only gets trace from parent
CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID: inconsistent; gets trace from both until the child finishes
CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_CHILD_CLEARTID|CLONE_SETTLS: only gets trace from parent
CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_CHILD_CLEARTID|CLONE_PARENT_SETTID: inconsistent; gets trace from both until the child finishes
CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID: only gets trace from parent
So at least I get the same behaviour from my program and the Java program: it does not work.
How can I make it work? For instance, how does strace successfully trace any kind of clone? I tried to dig into its code but I can't find how they are doing it.
Any help would be appreciated!
Best regards,
The tracer code (compile with g++ tracer.cpp -o tracer -g -lseccomp -lexplain):
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/reg.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sys/user.h>
#include <sys/prctl.h>
#include <fcntl.h>
#include <linux/limits.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <linux/unistd.h>
#include <libexplain/waitpid.h>
#include <tuple>
#include <vector>
#define DEFAULT_SIZE 1000
#define MAX_SIZE 1000
int process_signals();
int inspect(pid_t);
void read_string_into_buff(const pid_t, unsigned long long, char *, unsigned int);
int main(int argc, char **argv){
pid_t pid;
int status;
if (argc < 2) {
fprintf(stderr, "Usage: %s <prog> <arg1> ... <argN>\n", argv[0]);
return 1;
}
if ((pid = fork()) == 0) {
/* Trace getpid syscalls; allow everything else */
struct sock_filter filter[] = {
BPF_STMT(BPF_LD+BPF_W+BPF_ABS, offsetof(struct seccomp_data, nr)),
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_getpid, 0, 1),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_TRACE),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
};
struct sock_fprog prog = {
.len = (unsigned short) (sizeof(filter)/sizeof(filter[0])),
.filter = filter,
};
ptrace(PTRACE_TRACEME, 0, 0, 0);
if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) == -1) {
perror("prctl(PR_SET_NO_NEW_PRIVS)");
return 1;
}
if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog) == -1) {
perror("when setting seccomp filter");
return 1;
}
kill(getpid(), SIGSTOP);
return execvp(argv[1], argv + 1);
} else {
waitpid(pid, &status, 0);
ptrace(PTRACE_SETOPTIONS, pid, 0, PTRACE_O_TRACESECCOMP | PTRACE_O_TRACEFORK | PTRACE_O_TRACECLONE | PTRACE_O_TRACEVFORK );
ptrace(PTRACE_CONT, pid, 0, 0);
process_signals();
return 0;
}
}
int process_signals(){
int status;
while (1){
pid_t child_pid;
// When child status changes
if ((child_pid = waitpid(-1, &status, 0)) < 0){
fprintf(stderr, "%s\n", explain_waitpid(child_pid, &status, 0));
exit(EXIT_FAILURE);
}
//printf("Sigtrap received\n");
// Checking if it is thanks to seccomp
if (status >> 8 == (SIGTRAP | (PTRACE_EVENT_SECCOMP << 8))){
// Perform argument inspection with ptrace
int syscall = inspect(child_pid);
}
// Resume no matter what
ptrace(PTRACE_CONT, child_pid, 0, 0);
}
}
int inspect(pid_t pid){
printf("From PID: %d\n", pid);
struct user_regs_struct regs;
ptrace(PTRACE_GETREGS, pid, 0, &regs);
// Get syscall number
int syscall = regs.orig_rax;
printf("------\nCaught syscall: %d\n", syscall);
if (syscall == __NR_getpid){
printf("Getpid detected\n");
}
return syscall;
}
void read_string_into_buff(const pid_t pid, unsigned long long addr, char * buff, unsigned int max_len){
/* Are we aligned on the "start" front? */
unsigned int offset=((unsigned long)addr)%sizeof(long);
addr-=offset;
unsigned int i=0;
int done=0;
int word_offset=0;
while( !done ) {
unsigned long word=ptrace( PTRACE_PEEKDATA, pid, addr+(word_offset++)*sizeof(long), 0 );
// While loop to stop at the first '\0' char indicating end of string
while( !done && offset<sizeof(long) && i<max_len ) {
buff[i]=((char *)&word)[offset]; /* Endianity neutral copy */
done=buff[i]=='\0';
++i;
++offset;
}
offset=0;
done=done || i>=max_len;
}
}
The sample program (compile with gcc sample.c -o sample):
#define _GNU_SOURCE
#include <stdio.h>
#include <sched.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>
#include <signal.h>
#define FLAGS CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID
int fn(void *arg)
{
printf("\nINFO: This code is running under child process.\n");
int i = 0;
int n = atoi(arg);
for ( i = 1 ; i <= 10 ; i++ )
printf("[%d] %d * %d = %d\n", getpid(), n, i, (n*i));
printf("\n");
return 0;
}
void main(int argc, char *argv[])
{
printf("[%d] Hello, World!\n", getpid());
void *pchild_stack = malloc(1024 * 1024);
if ( pchild_stack == NULL ) {
printf("ERROR: Unable to allocate memory.\n");
exit(EXIT_FAILURE);
}
int pid = clone(fn, pchild_stack + (1024 * 1024), FLAGS, argv[1]);
if ( pid < 0 ) {
printf("ERROR: Unable to create the child process.\n");
exit(EXIT_FAILURE);
}
fn(argv[1]);
wait(NULL);
free(pchild_stack);
printf("INFO: Child process terminated.\n");
}
You can test this by running ./tracer ./sample. You can also run the original test case, ./tracer java, and observe that both the tracer and java hang.
ANSWER:
As pointed out in the comments, I had issues in that example that were preventing me from handling signals from the child.
In my original code (not listed here because it is too complex), I was only attaching ptrace AFTER the processes had started, and I was only attaching to the PIDs listed by pstree. My mistake was that I omitted the threads (and java is one program that does create threads), which explains why I had issues tracing java only.
I modified the code to attach to all the child processes and threads (ps -L -g <Main_PID> -o tid=) and everything works again.
Your sample program has a bug: it may free the second thread’s stack before that thread exits, causing a SEGV. And your tracer just doesn’t handle signals well.
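On the first point: with CLONE_THREAD the new thread cannot be reaped with wait(), so its stack has to stay allocated until the kernel reports that the thread is gone. A minimal sketch of one way to do that, assuming Linux with glibc: pass CLONE_CHILD_CLEARTID and wait on the thread-ID word as a futex before freeing. The names run_thread_and_join and store_answer are hypothetical and not part of the question's code.

```cpp
#include <linux/futex.h>
#include <sched.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <cstddef>
#include <cstdlib>

// Hypothetical thread body. No TLS is set up for the new thread, so it
// deliberately avoids libc calls and only writes through the pointer.
static int store_answer(void *arg) {
    *static_cast<volatile int *>(arg) = 42;
    return 0;
}

// Runs fn(arg) in a raw clone()d thread and frees the stack only after the
// kernel has signalled that the thread exited. Returns 0 on success, -1 on error.
static int run_thread_and_join(int (*fn)(void *), void *arg) {
    const std::size_t stack_size = 1024 * 1024;
    char *stack = static_cast<char *>(std::malloc(stack_size));
    if (stack == nullptr) return -1;

    // CLONE_PARENT_SETTID stores the new thread ID here before clone() returns;
    // CLONE_CHILD_CLEARTID makes the kernel zero it and futex-wake on thread exit.
    pid_t tid = 0;
    const int flags = CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND |
                      CLONE_THREAD | CLONE_SYSVSEM |
                      CLONE_PARENT_SETTID | CLONE_CHILD_CLEARTID;
    if (clone(fn, stack + stack_size, flags, arg, &tid, nullptr, &tid) == -1) {
        std::free(stack);
        return -1;
    }

    // Sleep until the word becomes 0. FUTEX_WAIT returns immediately if the
    // word no longer holds the value we saw, so the loop cannot miss the wake-up.
    for (;;) {
        pid_t seen = __atomic_load_n(&tid, __ATOMIC_ACQUIRE);
        if (seen == 0) break;
        syscall(SYS_futex, &tid, FUTEX_WAIT, seen, nullptr, nullptr, 0);
    }

    std::free(stack);  // safe now: the thread no longer runs on this stack
    return 0;
}
```

In the sample program this would replace the wait(NULL) / free(pchild_stack) pair, since wait() does not wait for threads created with CLONE_THREAD.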
If the traced program gets a signal, your tracer intercepts it, not passing it down to the program. When it continues the program, it continues from the very same operation that caused SEGV, so it gets SEGV again. Ad infinitum. Both the tracer and the tracee appear to hang but in fact, they are in an infinite loop.
Rewriting the continuation like the following seems to work:
if (status >> 8 == (SIGTRAP | (PTRACE_EVENT_SECCOMP << 8))){
    // Perform argument inspection with ptrace
    int syscall = inspect(child_pid);
    ptrace(PTRACE_CONT, child_pid, 0, 0);
} else if (WIFSTOPPED(status)) {
    ptrace(PTRACE_CONT, child_pid, 0, WSTOPSIG(status));
} else {
    ptrace(PTRACE_CONT, child_pid, 0, 0);
}
I'm not sure about Java, but it seems to get SEGVs in regular operation...
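The same decision can be factored into a small helper that maps a waitpid() status to the signal number to pass as the last argument of PTRACE_CONT. This is only a sketch under assumptions: signal_to_inject is a hypothetical name, and it suppresses every ptrace event stop and plain SIGTRAP stop, which may not be right for a tracee that raises SIGTRAP on purpose.

```cpp
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

// Returns the signal that should be delivered to the tracee when it is
// resumed, or 0 if the stop was caused by ptrace itself.
static int signal_to_inject(int status) {
    if (!WIFSTOPPED(status)) return 0;       // exited or killed: nothing to forward
    const int event = status >> 16;          // PTRACE_EVENT_* code, 0 for a plain signal stop
    const int sig = WSTOPSIG(status);
    if (event != 0 || sig == SIGTRAP) return 0;  // ptrace bookkeeping stop: suppress it
    return sig;                              // genuine signal (e.g. SIGSEGV): pass it on
}
```

In the tracer loop above, the call would then read ptrace(PTRACE_CONT, child_pid, 0, signal_to_inject(status)), so a SIGSEGV reaches the tracee's own handler instead of being swallowed and re-raised forever.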

Cuda: how to reset GPU after "sticky" error? [duplicate]

I have a working app which uses CUDA / C++, but sometimes, because of memory leaks, it throws an exception. I need to be able to reset the GPU while the app is running; my app is a server, so it has to stay available.
I tried something like this, but it doesn't seem to work:
try
{
    // do process using GPU
}
catch (std::exception &e)
{
    // catching exception from cuda only
    cudaSetDevice(0);
    CUDA_RETURN_(cudaDeviceReset());
}
My idea is to reset the device each time I get an exception from the GPU, but I cannot manage to make it work. :(
By the way, for various reasons I cannot fix every problem in my CUDA code, so I need a temporary solution. Thanks!
The only method to restore proper device functionality after a non-recoverable ("sticky") CUDA error is to terminate the host process that initiated (i.e. issued the CUDA runtime API calls that led to) the error.
Therefore, for a single-process application, the only method is to terminate the application.
It should be possible to design a multi-process application, where the initial ("parent") process makes no usage of CUDA whatsoever, and spawns a child process that uses the GPU. When the child process encounters an unrecoverable CUDA error, it must terminate.
The parent process can, optionally, monitor the child process. If it determines that the child process has terminated, it can re-spawn the process and restore CUDA functional behavior.
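The parent/child arrangement described above does not depend on CUDA itself. A minimal POSIX sketch of the supervising parent follows, with the GPU work replaced by a hypothetical flaky_worker that fails on its first attempt. The names supervise and flaky_worker are illustrative, and the convention that exit code 0 means "no sticky error" is an assumption.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cerrno>

// Hypothetical stand-in for the GPU work: fails on the first attempt only.
// A real worker would make all of its CUDA calls here, inside the child.
static int flaky_worker(int attempt) {
    return attempt == 0 ? 1 : 0;
}

// Runs worker(attempt) in a fresh child process, respawning it until it
// exits with status 0. Returns the number of attempts used, or -1 if every
// attempt failed or fork()/waitpid() reported an error.
static int supervise(int (*worker)(int), int max_attempts) {
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        pid_t pid = fork();
        if (pid == -1) return -1;
        if (pid == 0) {
            // Child: terminate right after the work so a broken context dies with it.
            _exit(worker(attempt));
        }
        int status = 0;
        while (waitpid(pid, &status, 0) == -1) {
            if (errno != EINTR) return -1;
        }
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            return attempt + 1;
        }
        // Otherwise the child failed or was killed: loop and respawn.
    }
    return -1;
}
```

The parent never touches the GPU, so a child that dies with a corrupted context leaves the parent free to start a clean one.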
Sticky vs. non-sticky errors are covered elsewhere, such as here.
An example of a proper multi-process app that uses e.g. fork() to spawn a child process that uses CUDA is available in the CUDA sample code simpleIPC. Here is a rough example assembled from the simpleIPC example (for linux):
$ cat t477.cu
/*
* Copyright 1993-2015 NVIDIA Corporation. All rights reserved.
*
* Please refer to the NVIDIA end user license agreement (EULA) associated
* with this source code for terms and conditions that govern your use of
* this software. Any use, reproduction, disclosure, or distribution of
* this software and related documentation outside the terms of the EULA
* is strictly prohibited.
*
*/
// Includes
#include <stdio.h>
#include <assert.h>
// CUDA runtime includes
#include <cuda_runtime_api.h>
// CUDA utilities and system includes
#include <helper_cuda.h>
#define MAX_DEVICES 1
#define PROCESSES_PER_DEVICE 1
#define DATA_BUF_SIZE 4096
#ifdef __linux
#include <unistd.h>
#include <sched.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <linux/version.h>
typedef struct ipcDevices_st
{
int count;
int results[MAX_DEVICES];
} ipcDevices_t;
// CUDA Kernel
__global__ void simpleKernel(int *dst, int *src, int num)
{
// Dummy kernel
int idx = blockIdx.x * blockDim.x + threadIdx.x;
dst[idx] = src[idx] / num;
}
void runTest(int index, ipcDevices_t* s_devices)
{
if (s_devices->results[0] == 0){
simpleKernel<<<1,1>>>(NULL, NULL, 1); // make a fault
cudaDeviceSynchronize();
s_devices->results[0] = 1;}
else {
int *d, *s;
int n = 1;
cudaMalloc(&d, n*sizeof(int));
cudaMalloc(&s, n*sizeof(int));
simpleKernel<<<1,1>>>(d, s, n);
cudaError_t err = cudaDeviceSynchronize();
if (err != cudaSuccess)
s_devices->results[0] = 0;
else
s_devices->results[0] = 2;}
cudaDeviceReset();
}
#endif
int main(int argc, char **argv)
{
ipcDevices_t *s_devices = (ipcDevices_t *) mmap(NULL, sizeof(*s_devices),
PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, 0, 0);
assert(MAP_FAILED != s_devices);
// We can't initialize CUDA before fork() so we need to spawn a new process
s_devices->count = 1;
s_devices->results[0] = 0;
printf("\nSpawning child process\n");
int index = 0;
pid_t pid = fork();
printf("> Process %3d\n", pid);
if (pid == 0) { // child process
// launch our test
runTest(index, s_devices);
}
// Cleanup and shutdown
else { // parent process
int status;
waitpid(pid, &status, 0);
if (s_devices->results[0] < 2) {
printf("first process launch reported error: %d\n", s_devices->results[0]);
printf("respawn\n");
pid_t newpid = fork();
if (newpid == 0) { // child process
// launch our test
runTest(index, s_devices);
}
// Cleanup and shutdown
else { // parent process
int status;
waitpid(newpid, &status, 0);
if (s_devices->results[0] < 2)
printf("second process launch reported error: %d\n", s_devices->results[0]);
else
printf("second process launch successful\n");
}
}
}
printf("\nShutting down...\n");
exit(EXIT_SUCCESS);
}
$ nvcc -I/usr/local/cuda/samples/common/inc t477.cu -o t477
$ ./t477
Spawning child process
> Process 10841
> Process 0
Shutting down...
first process launch reported error: 1
respawn
Shutting down...
second process launch successful
Shutting down...
$
For Windows, the only change needed should be to use a Windows IPC mechanism for host interprocess communication.

How does execve prevent vulnerabilities compared to the system command

I am referring to this link.
Consider the input happy'; useradd 'attacker. The security advice distinguishes between non-compliant and compliant code:
Non-Compliant Code
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
enum { BUFFERSIZE = 512 };
void func(const char *input) {
char cmdbuf[BUFFERSIZE];
int len_wanted = snprintf(cmdbuf, BUFFERSIZE,
"any_cmd '%s'", input);
if (len_wanted >= BUFFERSIZE) {
/* Handle error */
} else if (len_wanted < 0) {
/* Handle error */
} else if (system(cmdbuf) == -1) {
/* Handle error */
}
}
Compliant Code
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
void func(char *input) {
pid_t pid;
int status;
pid_t ret;
char *const args[3] = {"any_exe", input, NULL};
char **env;
extern char **environ;
/* ... Sanitize arguments ... */
pid = fork();
if (pid == -1) {
/* Handle error */
} else if (pid != 0) {
while ((ret = waitpid(pid, &status, 0)) == -1) {
if (errno != EINTR) {
/* Handle error */
break;
}
}
if ((ret != -1) &&
(!WIFEXITED(status) || !WEXITSTATUS(status)) ) {
/* Report unexpected child status */
}
} else {
/* ... Initialize env as a sanitized copy of environ ... */
if (execve("/usr/bin/any_cmd", args, env) == -1) {
/* Handle error */
_Exit(127);
}
}
}
Assume we pass the same input to both functions with equal privilege (i.e., run by root and so on). How does the second solution ensure that a command injection attack is repelled?
My only guess is that execve will replace the process image with any_cmd and use the input happy'; useradd 'attacker as an argument to any_cmd, so we will get a return value equivalent to "invalid parameters". Is my understanding right? Or is there something deeper that I am missing?
The main difference is indeed that with the system function you can launch whatever your shell can execute, so you can basically have shell injections with multiple commands. With execve, on the other hand, you first specify one particular binary to execute, so you can be pretty sure that only one command is executed (unless you execve a shell...). The input is passed as a single argument and is never parsed by a shell, so the quotes and the semicolon have no special meaning. Also, since you give a complete path to execve, you avoid hacks based on modifying PATH or the current working directory.
So yes, your understanding is essentially right.
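To see that the input really arrives as one uninterpreted argument, here is a small sketch that runs /bin/echo through fork() and execve() and captures what it prints. The helper name echo_via_execve is hypothetical, /bin/echo merely stands in for any_cmd, and the empty environment is an assumption made to keep the example short.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cerrno>
#include <string>

// Runs /bin/echo with `input` as its single argument and returns whatever
// the child wrote to stdout. Returns an empty string on failure.
static std::string echo_via_execve(const std::string &input) {
    int fds[2];
    if (pipe(fds) == -1) return "";

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return "";
    }
    if (pid == 0) {
        // Child: route stdout into the pipe, then replace the process image.
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        char *const args[] = {const_cast<char *>("echo"),
                              const_cast<char *>(input.c_str()), nullptr};
        char *const env[] = {nullptr};  // assumption: echo needs no environment
        execve("/bin/echo", args, env);
        _exit(127);                     // only reached if execve failed
    }

    // Parent: read until the child closes its end of the pipe.
    close(fds[1]);
    std::string out;
    char buf[256];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0) {
        out.append(buf, static_cast<std::size_t>(n));
    }
    close(fds[0]);

    int status = 0;
    while (waitpid(pid, &status, 0) == -1 && errno == EINTR) {
    }
    return out;
}
```

Calling echo_via_execve("happy'; useradd 'attacker") should hand the whole string, quotes and semicolon included, to echo as argv[1]; no second command is ever started because no shell is involved.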

How can you use CaptureStackBackTrace to capture the exception stack, not the calling stack?

I marked up the following code:
#include "stdafx.h"
#include <process.h>
#include <iostream>
#include <Windows.h>
#include <dbghelp.h>
using namespace std;
#define TRACE_MAX_STACK_FRAMES 1024
#define TRACE_MAX_FUNCTION_NAME_LENGTH 1024
int printStackTrace()
{
void *stack[TRACE_MAX_STACK_FRAMES];
HANDLE process = GetCurrentProcess();
SymInitialize(process, NULL, TRUE);
WORD numberOfFrames = CaptureStackBackTrace(0, TRACE_MAX_STACK_FRAMES, stack, NULL);
char buf[sizeof(SYMBOL_INFO)+(TRACE_MAX_FUNCTION_NAME_LENGTH - 1) * sizeof(TCHAR)];
SYMBOL_INFO* symbol = (SYMBOL_INFO*)buf;
symbol->MaxNameLen = TRACE_MAX_FUNCTION_NAME_LENGTH;
symbol->SizeOfStruct = sizeof(SYMBOL_INFO);
DWORD displacement;
IMAGEHLP_LINE64 line;
line.SizeOfStruct = sizeof(IMAGEHLP_LINE64);
for (int i = 0; i < numberOfFrames; i++)
{
DWORD64 address = (DWORD64)(stack[i]);
SymFromAddr(process, address, NULL, symbol);
if (SymGetLineFromAddr64(process, address, &displacement, &line))
{
printf("\tat %s in %s: line: %lu: address: 0x%0X\n", symbol->Name, line.FileName, line.LineNumber, symbol->Address);
}
else
{
printf("\tSymGetLineFromAddr64 returned error code %lu.\n", GetLastError());
printf("\tat %s, address 0x%0X.\n", symbol->Name, symbol->Address);
}
}
return 0;
}
void function2()
{
int a = 0;
int b = 0;
throw new exception;
}
void function1()
{
int a = 0;
function2();
}
void function0()
{
function1();
}
static void threadFunction(void *param)
{
try
{
function0();
}
catch (...)
{
printStackTrace();
}
}
int _tmain(int argc, _TCHAR* argv[])
{
_beginthread(threadFunction, 0, NULL);
printf("Press any key to exit.\n");
cin.get();
return 0;
}
What it does is, it logs a stack trace, but the problem is that the stack trace it logs does not give me the line numbers that I want. I want it to log the line numbers of the places that threw the exception, on and up the call stack, kind of like in C#. But what it actually does right now, is it outputs the following:
at printStackTrace in c:\users\<yourusername>\documents\visual studio 2013\pr
ojects\stacktracing\stacktracing\stacktracing.cpp: line: 17: address: 0x10485C0
at threadFunction in c:\users\<yourusername>\documents\visual studio 2013\pro
jects\stacktracing\stacktracing\stacktracing.cpp: line: 68: address: 0x10457C0
SymGetLineFromAddr64 returned error code 487.
at beginthread, address 0xF9431E0.
SymGetLineFromAddr64 returned error code 487.
at endthread, address 0xF9433E0.
SymGetLineFromAddr64 returned error code 487.
at BaseThreadInitThunk, address 0x7590494F.
SymGetLineFromAddr64 returned error code 487.
at RtlInitializeExceptionChain, address 0x7713986A.
SymGetLineFromAddr64 returned error code 487.
at RtlInitializeExceptionChain, address 0x7713986A.
The problem I am facing, once again, is that line: 68 in this trace corresponds to the line that calls the method printStackTrace();, while I would like it to give me line number 45, which corresponds to the line which throws the exception: throw new exception; and then continue further up the stack.
How can I achieve this sort of behavior and break into this thread exactly when it throws this exception in order to get a proper stack trace?
PS The code above was run for a console application using MSVC++ with unicode enabled on Windows 8.1 x64 machine, with the application being run as a Win32 application in Debug mode.
On Windows, MSVC implements C++ exceptions on top of SEH, so throwing a C++ exception raises an SEH exception. An SEH __except block lets you attach a filter that receives an _EXCEPTION_POINTERS structure, which contains a pointer to the processor's context record at the moment the exception was thrown. Passing this pointer to the StackWalk64 function gives the stack trace at the moment of the exception. So this problem can be solved by using SEH-style exception handling instead of C++-style.
Example code:
#include <stdlib.h>
#include <locale.h>
#include <stdio.h>
#include <tchar.h>
#include <process.h>
#include <iostream>
#include <Windows.h>
#include "dbghelp.h"
using namespace std;
const int MaxNameLen = 256;
#pragma comment(lib,"Dbghelp.lib")
void printStack( CONTEXT* ctx ) //Prints stack trace based on context record
{
BOOL result;
HANDLE process;
HANDLE thread;
HMODULE hModule;
STACKFRAME64 stack;
ULONG frame;
DWORD64 displacement;
DWORD disp;
IMAGEHLP_LINE64 *line;
char buffer[sizeof(SYMBOL_INFO) + MAX_SYM_NAME * sizeof(TCHAR)];
char name[MaxNameLen];
char module[MaxNameLen];
PSYMBOL_INFO pSymbol = (PSYMBOL_INFO)buffer;
// On x64, StackWalk64 modifies the context record, that could
// cause crashes, so we create a copy to prevent it
CONTEXT ctxCopy;
memcpy(&ctxCopy, ctx, sizeof(CONTEXT));
memset( &stack, 0, sizeof( STACKFRAME64 ) );
process = GetCurrentProcess();
thread = GetCurrentThread();
displacement = 0;
#if !defined(_M_AMD64)
stack.AddrPC.Offset = (*ctx).Eip;
stack.AddrPC.Mode = AddrModeFlat;
stack.AddrStack.Offset = (*ctx).Esp;
stack.AddrStack.Mode = AddrModeFlat;
stack.AddrFrame.Offset = (*ctx).Ebp;
stack.AddrFrame.Mode = AddrModeFlat;
#endif
SymInitialize( process, NULL, TRUE ); //load symbols
for( frame = 0; ; frame++ )
{
//get next call from stack
result = StackWalk64
(
#if defined(_M_AMD64)
IMAGE_FILE_MACHINE_AMD64
#else
IMAGE_FILE_MACHINE_I386
#endif
,
process,
thread,
&stack,
&ctxCopy,
NULL,
SymFunctionTableAccess64,
SymGetModuleBase64,
NULL
);
if( !result ) break;
//get symbol name for address
pSymbol->SizeOfStruct = sizeof(SYMBOL_INFO);
pSymbol->MaxNameLen = MAX_SYM_NAME;
SymFromAddr(process, ( ULONG64 )stack.AddrPC.Offset, &displacement, pSymbol);
line = (IMAGEHLP_LINE64 *)malloc(sizeof(IMAGEHLP_LINE64));
line->SizeOfStruct = sizeof(IMAGEHLP_LINE64);
//try to get line
if (SymGetLineFromAddr64(process, stack.AddrPC.Offset, &disp, line))
{
printf("\tat %s in %s: line: %lu: address: 0x%0X\n", pSymbol->Name, line->FileName, line->LineNumber, pSymbol->Address);
}
else
{
//failed to get line
printf("\tat %s, address 0x%0X.\n", pSymbol->Name, pSymbol->Address);
hModule = NULL;
lstrcpyA(module,"");
GetModuleHandleEx(GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS | GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT,
(LPCTSTR)(stack.AddrPC.Offset), &hModule);
//at least print module name
if(hModule != NULL)GetModuleFileNameA(hModule,module,MaxNameLen);
printf ("in %s\n",module);
}
free(line);
line = NULL;
}
}
//******************************************************************************
void function2()
{
int a = 0;
int b = 0;
throw exception();
}
void function1()
{
int a = 0;
function2();
}
void function0()
{
function1();
}
int seh_filter(_EXCEPTION_POINTERS* ex)
{
printf("*** Exception 0x%x occured ***\n\n",ex->ExceptionRecord->ExceptionCode);
printStack(ex->ContextRecord);
return EXCEPTION_EXECUTE_HANDLER;
}
static void threadFunction(void *param)
{
__try
{
function0();
}
__except(seh_filter(GetExceptionInformation()))
{
printf("Exception \n");
}
}
int _tmain(int argc, _TCHAR* argv[])
{
_beginthread(threadFunction, 0, NULL);
printf("Press any key to exit.\n");
cin.get();
return 0;
}
Example output (first two entries are noise, but the rest correctly reflects functions that caused exception):
*** Exception 0xe06d7363 occured ***
at RaiseException, address 0xFD3F9E20.
in C:\Windows\system32\KERNELBASE.dll
at CxxThrowException, address 0xDBB5A520.
in C:\Windows\system32\MSVCR110D.dll
at function2 in c:\work\projects\test\test.cpp: line: 146: address: 0x3F9C6C00
at function1 in c:\work\projects\test\test.cpp: line: 153: address: 0x3F9C6CB0
at function0 in c:\work\projects\test\test.cpp: line: 158: address: 0x3F9C6CE0
at threadFunction in c:\work\projects\test\test.cpp: line: 174: address: 0x3F9C6D70
at beginthread, address 0xDBA66C60.
in C:\Windows\system32\MSVCR110D.dll
at endthread, address 0xDBA66E90.
in C:\Windows\system32\MSVCR110D.dll
at BaseThreadInitThunk, address 0x773C6520.
in C:\Windows\system32\kernel32.dll
at RtlUserThreadStart, address 0x775FC520.
in C:\Windows\SYSTEM32\ntdll.dll
Another option is to create a custom exception class that captures the context in its constructor, and to use it (or derived classes) to throw exceptions:
class MyException{
public:
CONTEXT Context;
MyException(){
RtlCaptureContext(&Context);
}
};
void function2()
{
throw MyException();
}
//...
try
{
function0();
}
catch (MyException& e)
{
printf("Exception \n");
printStack(&e.Context);
}
If you want to capture the stack backtrace of the point where the code threw an exception, you must capture it in the constructor of the exception object and store it within the exception object. Hence the part calling CaptureStackBackTrace() should be moved to the constructor of the exception object, which should also provide methods to fetch it either as a vector of addresses or as a vector of symbols. This is exactly how Throwable in Java and Exception in C# operate.
Finally, please do not write:
throw new exception;
in C++, as you would in C# or Java. This is an excellent way to both produce memory leaks and to fail to catch the exceptions by type (as you are throwing pointers to these types). Rather use:
throw exception();
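The constructor-capture approach described above can be sketched as follows. TracedException, thrower and frames_seen_by_handler are hypothetical names. To keep the sketch buildable outside Windows, it falls back to glibc's backtrace() when _WIN32 is not defined; that fallback is a stand-in for illustration, not something from the original answers.

```cpp
#include <cstddef>
#include <exception>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#else
#include <execinfo.h>  // glibc backtrace(), used only as a non-Windows stand-in
#endif

// Exception type that records the call stack at the point of construction,
// i.e. where `throw TracedException()` is written, not where it is caught.
class TracedException : public std::exception {
public:
    TracedException() {
        void *raw[kMaxFrames];
#ifdef _WIN32
        unsigned n = CaptureStackBackTrace(0, kMaxFrames, raw, nullptr);
#else
        int got = backtrace(raw, kMaxFrames);
        unsigned n = got > 0 ? static_cast<unsigned>(got) : 0u;
#endif
        frames_.assign(raw, raw + n);
    }
    // Raw return addresses, innermost frame first.
    const std::vector<void *> &frames() const noexcept { return frames_; }
    const char *what() const noexcept override { return "TracedException"; }

private:
    enum { kMaxFrames = 62 };
    std::vector<void *> frames_;
};

static void thrower() { throw TracedException(); }

// Shows that the handler can still see the frames recorded at the throw site.
static std::size_t frames_seen_by_handler() {
    try {
        thrower();
    } catch (const TracedException &e) {
        return e.frames().size();
    }
    return 0;
}
```

On Windows, the stored addresses could then be resolved in the catch block with the same SymFromAddr / SymGetLineFromAddr64 loop used in printStackTrace() from the question.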
I'm aware that this is an old question, but people (including myself) are still finding it.
Are you missing the SymSetOptions call below?
SymInitialize(process, NULL, TRUE);
SymSetOptions(SYMOPT_LOAD_LINES);

global static boolean pointer causes segmentation fault using pthread

I'm new to pthread programming and stuck on this error while working on mixed C++ and C code.
I call the C code from a thread created by the C++ code. A static bool pointer, is_center, is used in the thread and should be freed when the thread finishes.
However, I noticed that every time the program enters the C function, the value of the pointer changes, and the free() then causes a segmentation fault. The problem only happens when the C code is used; if I remove the C code, the multithreaded C++ part works well.
The detailed code is as follows:
static bool *is_center;
// omit other codes in between ...
void streamCluster( PStream* stream)
{
    // some code here ...
    while(1){
        // some code here ...
        is_center = (bool*)calloc(points.num,sizeof(bool));
        // start the parallel thread here.
        // the c code is invoked in this function.
        localSearch(&points,kmin, kmax,&kfinal); // parallel
        free(is_center);
    }
}
And the function using parallel is as follows (my c code is invoked in each thread):
void localSearch( Points* points, long kmin, long kmax, long* kfinal ) {
pthread_barrier_t barrier;
pthread_t* threads = new pthread_t[nproc];
pkmedian_arg_t* arg = new pkmedian_arg_t[nproc];
pthread_barrier_init(&barrier,NULL,nproc);
for( int i = 0; i < nproc; i++ ) {
arg[i].points = points;
arg[i].kmin = kmin;
arg[i].kmax = kmax;
arg[i].pid = i;
arg[i].kfinal = kfinal;
arg[i].barrier = &barrier;
pthread_create(threads+i,NULL,localSearchSub,(void*)&arg[i]);
}
for ( int i = 0; i < nproc; i++) {
pthread_join(threads[i],NULL);
}
delete[] threads;
delete[] arg;
pthread_barrier_destroy(&barrier);
}
Finally the function calling my c code:
void* localSearchSub(void* arg_) {
int eventSet = PAPI_NULL;
begin_papi_thread(&eventSet);
pkmedian_arg_t* arg= (pkmedian_arg_t*)arg_;
pkmedian(arg->points,arg->kmin,arg->kmax,arg->kfinal,arg->pid,arg->barrier);
end_papi_thread(&eventSet);
return NULL;
}
And from gdb, what I have got for the is_center is:
Breakpoint 2, localSearchSub (arg_=0x600000000000bc40) at streamcluster.cpp:1711
1711 end_papi_thread(&eventSet);
(gdb) s
Hardware watchpoint 1: is_center
Old value = (bool *) 0x600000000000bba0
New value = (bool *) 0xa93f3
0x400000000000d8d1 in localSearchSub (arg_=0x600000000000bc40) at streamcluster.cpp:1711
1711 end_papi_thread(&eventSet);
Any suggestions? Thanks in advance!
Some new information about the code: for the c code, I am using the PAPI package. I write my own papi wrapper to initialize and read system counters. The code is as follows:
void begin_papi_thread(int* eventSet)
{
int thread_id = pthread_self();
// Events
if (PAPI_create_eventset(eventSet)) {
PAPI_perror(return_value, error_string, PAPI_MAX_STR_LEN);
printf("*** ERROR *** Failed to create event set for thread %d: %s\n.", thread_id, error_string);
}
if((return_value = PAPI_add_events(*eventSet, event_code, event_num)) != PAPI_OK)
{
printf("*** ERROR *** Failed to add event for thread %d: %d.\n", thread_id, return_value);
}
// Start counting
if ((return_value = PAPI_start(*eventSet)) != PAPI_OK) {
PAPI_perror(return_value, error_string, PAPI_MAX_STR_LEN);
printf("*** ERROR *** PAPI failed to start the event for thread %d: %s.\n", thread_id, error_string);
}
}
void end_papi_thread(int* eventSet)
{
int thread_id = pthread_self();
int i;
long long * count_values = (long long*)malloc(sizeof(long long) * event_num);
if (PAPI_read(*eventSet, count_values) != PAPI_OK)
printf("*** ERROR *** Failed to load count values.\n");
if (PAPI_stop(*eventSet, &dummy_values) != PAPI_OK) {
PAPI_perror(return_value, error_string, PAPI_MAX_STR_LEN);
printf("*** ERROR *** PAPI failed to stop the event for thread %d: %s.\n", thread_id, error_string);
return;
}
if(PAPI_cleanup_eventset(*eventSet) != PAPI_OK)
printf("*** ERROR *** Clean up failed for the thread %d.\n", thread_id);
}
I don't think you've posted enough code to really understand your problem, but it looks suspicious that you've declared is_center global. I assume you're using it in more than one place, possibly by multiple threads (localSearchSub mentions it, which is your worker thread function).
If is_center is being read or written by multiple threads, you probably want to protect it with a pthread mutex. You say it is "freed when the thread finishes", but you should be aware that there are nproc threads, and it looks like they're all working on an array of is_center[points] bools. If points != nproc, this could be a bad thing[1]. Each thread should probably work on its own array, and localSearch should aggregate the results.
The xxx_papi_thread functions don't get any hits on Google, so I can only imagine it's your own... unlikely we'll be able to help you, if the problem is in there :)
[1]: Even if points == nproc, it is not necessarily OK to write to different elements of an array from multiple threads (it's compiler and processor dependent). Be safe, use a mutex.
Also, this is tagged C++. Can you replace the calloc and dynamic arrays (using new) with vectors? It might end up easier to debug, and it certainly ends up easier to maintain. Why do you hate and want to punish the readers of your code? ;)
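The suggestions above (one private array per thread, aggregation after the join, std::vector instead of calloc) can be sketched like this. WorkerArg, mark_centers and find_centers are hypothetical names, and the "every third point is a center" rule is a placeholder for the real pkmedian logic.

```cpp
#include <pthread.h>
#include <cstddef>
#include <vector>

// Per-thread state: each worker owns its slice and its own flag array,
// so no global pointer is shared and no mutex is needed.
struct WorkerArg {
    std::size_t begin = 0;         // first point index handled by this worker
    std::size_t end = 0;           // one past the last point index
    std::vector<char> is_center;   // private flags, one per point in [begin, end)
};

static void *mark_centers(void *raw) {
    WorkerArg *arg = static_cast<WorkerArg *>(raw);
    arg->is_center.assign(arg->end - arg->begin, 0);
    for (std::size_t i = arg->begin; i < arg->end; ++i) {
        if (i % 3 == 0) arg->is_center[i - arg->begin] = 1;  // placeholder rule
    }
    return nullptr;
}

// Splits num_points across nproc threads (nproc > 0 assumed) and returns the
// concatenated flags once every worker has been joined.
static std::vector<char> find_centers(std::size_t num_points, int nproc) {
    std::vector<pthread_t> threads(nproc);
    std::vector<bool> started(nproc, false);
    std::vector<WorkerArg> args(nproc);  // sized up front, so element addresses stay valid

    for (int t = 0; t < nproc; ++t) {
        args[t].begin = num_points * t / nproc;
        args[t].end = num_points * (t + 1) / nproc;
        if (pthread_create(&threads[t], nullptr, mark_centers, &args[t]) == 0) {
            started[t] = true;
        } else {
            mark_centers(&args[t]);  // could not start a thread: do this slice inline
        }
    }

    std::vector<char> all;
    all.reserve(num_points);
    for (int t = 0; t < nproc; ++t) {
        if (started[t]) pthread_join(threads[t], nullptr);
        all.insert(all.end(), args[t].is_center.begin(), args[t].is_center.end());
    }
    return all;  // the vectors free themselves; there is no manual free() to get wrong
}
```

Because the vectors live in args until after pthread_join, nothing can be released while a worker is still writing to it.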