C++ Windows threading and mutex issue

I am a bit rusty with threaded programs, especially on Windows.
I have created a simple MEX file in Matlab that is meant to read a number of files, with each file being read in its own thread.
The file doesn't do anything really useful, but it is a precursor to a more complicated version that will use all of the functionality I've put into this file.
Here is the code:
#include <windows.h>
#include "mex.h"
#include <fstream>
typedef unsigned char uchar;
typedef unsigned int uint;
using namespace std;
int N;
int nThreads;
const int BLOCKSIZE = 1024;
char * buffer;
char * out;
HANDLE hIOMutex;
DWORD WINAPI runThread(LPVOID argPos) {
int pos = *(reinterpret_cast<int*>(argPos));
DWORD dwWaitResult = WaitForSingleObject( hIOMutex, INFINITE );
if (dwWaitResult == WAIT_OBJECT_0){
char buf[20];
sprintf(buf, "test%i.dat", pos);
ifstream ifs(buf, ios::binary);
if (!ifs.fail()) {
mexPrintf("Running thread:%i\n", pos);
for (int i=0; i<N/BLOCKSIZE;i++) {
if (ifs.eof()){
mexPrintf("File %s exited at i=%i\n", buf, (i-1)*BLOCKSIZE);
break;
}
ifs.read(&buffer[pos*BLOCKSIZE], BLOCKSIZE);
}
}
else {
mexPrintf("Could not open file %s\n", buf);
}
ifs.close();
ReleaseMutex( hIOMutex);
}
else
mexPrintf("The Mutex failed in thread:%i \n", pos);
return TRUE;
}
// 0 - N is data size
// 1 - nThreads is number of threads
// 2 - this is the output array
void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray*prhs[] ) {
N = mxGetScalar(prhs[0]);
nThreads = mxGetScalar(prhs[1]);
out = (char*)mxGetData(prhs[2]);
buffer = (char*)malloc(BLOCKSIZE*nThreads);
hIOMutex= CreateMutex(NULL, FALSE, NULL);
HANDLE *hArr = (HANDLE*)malloc(sizeof(HANDLE)*nThreads);
int *tInd = (int*)malloc(sizeof(int)*nThreads);
for (int i=0;i<nThreads;i++){
tInd[i]=i;
hArr[i] = CreateThread( NULL, 0, runThread, &tInd[i], 0, NULL);
if (!hArr[i]) {
mexPrintf("Failed to start thread:%i\n", i);
break;
}
}
WaitForMultipleObjects( nThreads, hArr, TRUE, INFINITE);
for (int i=0;i<nThreads;i++)
CloseHandle(hArr[i]);
CloseHandle(hIOMutex);
mexEvalString("drawnow");
mexPrintf("Finished all threads.\n");
free(hArr);
free(tInd);
free(buffer);
}
I compile it like this in Matlab:
mex readFile.cpp
And then run it like this:
out = zeros(1024*1024,1,'uint8');
readFile(1024*1024,nFiles,out);
The problem is that when I set nFiles to be less than or equal to 64 everything works as expected and I get the following output:
Running thread:0
.
.
.
Running thread:62
Running thread:63
Finished all threads.
However when I set nFiles to 65 or larger I get:
Running thread:0
Running thread:1
Running thread:2
Running thread:3
The Mutex failed in thread:59
The Mutex failed in thread:60
The Mutex failed in thread:61
.
.
.
(up to nFiles-1)
Finished all threads.
I have also tested it without threading and it works fine.
I cannot see what I'm doing wrong or why the cutoff for using the mutex would be so arbitrary, so I assume there is something I am not taking into account.
Can anyone see where I have made a blatant mistake relating to the error I'm seeing?

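The documentation for WaitForMultipleObjects states: "The maximum number of object handles is MAXIMUM_WAIT_OBJECTS", which is 64 on most systems. With more than 64 handles the call fails immediately instead of waiting, so mexFunction goes on to close hIOMutex while threads are still running, which would explain the "Mutex failed" messages.
This is also (almost) a duplicate of this thread. In summary: yes, the limit is 64, and you can use the information in the Remarks section of the WaitForMultipleObjects documentation to build up a tree of threads to wait on.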
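When the goal is simply to wait until every thread has finished, a simpler alternative to a tree of waiter threads is to wait in batches of at most MAXIMUM_WAIT_OBJECTS handles. The following is a minimal sketch under that assumption, not code from the linked thread; waitBatches and waitForAll are hypothetical names. The batching arithmetic is portable, and the Windows call is guarded so that it only compiles on Windows.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#endif

// Split `count` handles into (offset, length) batches of at most `limit` each.
// `waitBatches` is a hypothetical helper name, not part of any Windows API.
std::vector<std::pair<std::size_t, std::size_t>>
waitBatches(std::size_t count, std::size_t limit) {
    assert(limit > 0);
    std::vector<std::pair<std::size_t, std::size_t>> batches;
    for (std::size_t offset = 0; offset < count; offset += limit) {
        std::size_t len = (count - offset < limit) ? count - offset : limit;
        batches.emplace_back(offset, len);
    }
    return batches;
}

#ifdef _WIN32
// Wait until every handle in `handles` is signaled, one batch at a time.
// Returns false as soon as one of the waits fails.
bool waitForAll(const std::vector<HANDLE>& handles) {
    for (const auto& b : waitBatches(handles.size(), MAXIMUM_WAIT_OBJECTS)) {
        DWORD r = WaitForMultipleObjects(static_cast<DWORD>(b.second),
                                         handles.data() + b.first,
                                         TRUE, INFINITE);
        if (r == WAIT_FAILED) return false;
    }
    return true;
}
#endif
```

With 65 thread handles this would issue one wait on 64 handles followed by one wait on the remaining handle. Checking the return value matters either way: the code in the question ignores it, which hides the failure.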

wait() hangs when CLONE_THREAD

I am tracing some processes and their children using ptrace. I am trying to print specific system calls (using a seccomp filter that notifies ptrace; see this blogpost).
In most cases my code (see below) works fine. However, when I trace a Java program (from the default-jre package), the latter clones using the CLONE_THREAD flag, and for some reason my tracer hangs, I believe because I can't receive signals from the cloned process. I think the reason is that (according to this discussion) the child process in fact becomes a child of the original process' parent, instead of becoming the original process' child.
I reproduced this issue with a simple program that just calls clone() with flags and performs some actions. When I use the CLONE_THREAD | CLONE_SIGHAND | CLONE_VM flags (which, as the clone() documentation specifies, must come together since Linux 2.6.0), I am at least able to trace everything correctly until one of the two threads finishes.
I would like to trace both threads independently. Is that possible?
More importantly, I need to trace a Java program, and I cannot change it. Here is the strace output for the Java program's clone call:
[...]
4665 clone(child_stack=0x7fb166e95fb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tid=[4666], tls=0x7fb166e96700, child_tidptr=0x7fb166e969d0) = 4666
[...]
So Java seems to respect the rules. I wanted to experiment in order to understand, so I ruled out any flags unrelated to threads (i.e., CLONE_FS | CLONE_FILES | CLONE_SYSVSEM).
Here are the results of running my test program with different combinations of flags (I know, I am really desperate):
CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS: only gets trace from parent
CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_PARENT_SETTID: inconsistent; gets trace from both until the parent finishes
CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_CHILD_CLEARTID: inconsistent; gets trace from both until the child finishes
CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS|CLONE_PARENT_SETTID: only gets trace from parent
CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS|CLONE_CHILD_CLEARTID: only gets trace from parent
CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_PARENT_SETTID|CLONE_SETTLS: only gets trace from parent
CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID: inconsistent; gets trace from both until the child finishes
CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_CHILD_CLEARTID|CLONE_SETTLS: only gets trace from parent
CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_CHILD_CLEARTID|CLONE_PARENT_SETTID: inconsistent; gets trace from both until the child finishes
CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID: only gets trace from parent
So at least I get the same behaviour from my program and the Java program: it does not work.
How can I make it work? For instance, how does strace successfully trace any kind of clone? I tried to dig into its code, but I can't find how it is done.
Any help would be appreciated!
Best regards,
The tracer code (compile with g++ tracer.cpp -o tracer -g -lseccomp -lexplain):
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/reg.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sys/user.h>
#include <sys/prctl.h>
#include <fcntl.h>
#include <linux/limits.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <linux/unistd.h>
#include <libexplain/waitpid.h>
#include <tuple>
#include <vector>
#define DEFAULT_SIZE 1000
#define MAX_SIZE 1000
int process_signals();
int inspect(pid_t);
void read_string_into_buff(const pid_t, unsigned long long, char *, unsigned int);
int main(int argc, char **argv){
pid_t pid;
int status;
if (argc < 2) {
fprintf(stderr, "Usage: %s <prog> <arg1> ... <argN>\n", argv[0]);
return 1;
}
if ((pid = fork()) == 0) {
/* If getpid syscall, trace */
struct sock_filter filter[] = {
BPF_STMT(BPF_LD+BPF_W+BPF_ABS, offsetof(struct seccomp_data, nr)),
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_getpid, 0, 1),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_TRACE),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
};
struct sock_fprog prog = {
.len = (unsigned short) (sizeof(filter)/sizeof(filter[0])),
.filter = filter,
};
ptrace(PTRACE_TRACEME, 0, 0, 0);
if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) == -1) {
perror("prctl(PR_SET_NO_NEW_PRIVS)");
return 1;
}
if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog) == -1) {
perror("when setting seccomp filter");
return 1;
}
kill(getpid(), SIGSTOP);
return execvp(argv[1], argv + 1);
} else {
waitpid(pid, &status, 0);
ptrace(PTRACE_SETOPTIONS, pid, 0, PTRACE_O_TRACESECCOMP | PTRACE_O_TRACEFORK | PTRACE_O_TRACECLONE | PTRACE_O_TRACEVFORK );
ptrace(PTRACE_CONT, pid, 0, 0);
process_signals();
return 0;
}
}
int process_signals(){
int status;
while (1){
pid_t child_pid;
// When child status changes
if ((child_pid = waitpid(-1, &status, 0)) < 0){
fprintf(stderr, "%s\n", explain_waitpid(child_pid, &status, 0));
exit(EXIT_FAILURE);
}
//printf("Sigtrap received\n");
// Checking if it is thanks to seccomp
if (status >> 8 == (SIGTRAP | (PTRACE_EVENT_SECCOMP << 8))){
// Perform argument inspection with ptrace
int syscall = inspect(child_pid);
}
// Resume no matter what
ptrace(PTRACE_CONT, child_pid, 0, 0);
}
}
int inspect(pid_t pid){
printf("From PID: %d\n", pid);
struct user_regs_struct regs;
ptrace(PTRACE_GETREGS, pid, 0, &regs);
// Get syscall number
int syscall = regs.orig_rax;
printf("------\nCaught syscall: %d\n", syscall);
if (syscall == __NR_getpid){
printf("Getpid detected\n");
}
return syscall;
}
void read_string_into_buff(const pid_t pid, unsigned long long addr, char * buff, unsigned int max_len){
/* Are we aligned on the "start" front? */
unsigned int offset=((unsigned long)addr)%sizeof(long);
addr-=offset;
unsigned int i=0;
int done=0;
int word_offset=0;
while( !done ) {
unsigned long word=ptrace( PTRACE_PEEKDATA, pid, addr+(word_offset++)*sizeof(long), 0 );
// While loop to stop at the first '\0' char indicating end of string
while( !done && offset<sizeof(long) && i<max_len ) {
buff[i]=((char *)&word)[offset]; /* Endianity neutral copy */
done=buff[i]=='\0';
++i;
++offset;
}
offset=0;
done=done || i>=max_len;
}
}
The sample program (compile with gcc sample.c -o sample):
#define _GNU_SOURCE
#include <stdio.h>
#include <sched.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>
#include <signal.h>
#define FLAGS CLONE_VM|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID
int fn(void *arg)
{
printf("\nINFO: This code is running under child process.\n");
int i = 0;
int n = atoi(arg);
for ( i = 1 ; i <= 10 ; i++ )
printf("[%d] %d * %d = %d\n", getpid(), n, i, (n*i));
printf("\n");
return 0;
}
void main(int argc, char *argv[])
{
printf("[%d] Hello, World!\n", getpid());
void *pchild_stack = malloc(1024 * 1024);
if ( pchild_stack == NULL ) {
printf("ERROR: Unable to allocate memory.\n");
exit(EXIT_FAILURE);
}
int pid = clone(fn, pchild_stack + (1024 * 1024), FLAGS, argv[1]);
if ( pid < 0 ) {
printf("ERROR: Unable to create the child process.\n");
exit(EXIT_FAILURE);
}
fn(argv[1]);
wait(NULL);
free(pchild_stack);
printf("INFO: Child process terminated.\n");
}
You can test this by running ./tracer ./sample. You can also test the original test case with ./tracer java and observe that both the tracer and java hang.
ANSWER:
As pointed out in the comments, there were issues in that example that prevented me from handling signals from the child.
In my original code (not listed here because it is too complex), I was only attaching ptrace AFTER the processes started, and I was only attaching to the PIDs listed by pstree. My mistake was that I omitted the threads (and Java is one program that does create threads), which explains why I had issues tracing only Java.
I modified the code to attach to all child processes and threads (ps -L -g <Main_PID> -o tid=) and everything works again.
Your sample program has a bug: it may free the second thread’s stack before that thread exits, causing a SEGV. And your tracer just doesn’t handle signals well.
If the traced program gets a signal, your tracer intercepts it, not passing it down to the program. When it continues the program, it continues from the very same operation that caused SEGV, so it gets SEGV again. Ad infinitum. Both the tracer and the tracee appear to hang but in fact, they are in an infinite loop.
Rewriting the continuation like the following seems to work:
        if (status >> 8 == (SIGTRAP | (PTRACE_EVENT_SECCOMP << 8))){
            // Perform argument inspection with ptrace
            int syscall = inspect(child_pid);
            ptrace(PTRACE_CONT, child_pid, 0, 0);
        } else if (WIFSTOPPED(status)) {
            ptrace(PTRACE_CONT, child_pid, 0, WSTOPSIG(status));
        } else {
            ptrace(PTRACE_CONT, child_pid, 0, 0);
        }
Not sure of Java but it seems to get SEGVs in regular operation...
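The decision about which signal to hand back to the tracee can be pulled into a small function. The sketch below is only one possible refinement, under stated assumptions: signalToForward is a hypothetical helper (not part of the ptrace API), it treats every PTRACE_EVENT_* stop and every plain SIGTRAP as "nothing to forward", and it does not treat the initial SIGSTOP of automatically attached children specially.

```cpp
#include <cassert>
#include <csignal>
#include <sys/wait.h>

// Decide which signal (if any) to pass as the last argument of PTRACE_CONT
// for a status returned by waitpid(). `signalToForward` is a hypothetical
// helper name; it is not part of the ptrace API.
int signalToForward(int status) {
    if (!WIFSTOPPED(status))
        return 0;          // tracee exited or was killed: nothing to forward
    if ((status >> 16) != 0)
        return 0;          // PTRACE_EVENT_* stop (seccomp, clone, fork, ...)
    int sig = WSTOPSIG(status);
    if (sig == SIGTRAP)
        return 0;          // assumed to be a ptrace trap, not a real signal
    return sig;            // genuine signal such as SIGSEGV: deliver it
}
```

In the loop this would be used as ptrace(PTRACE_CONT, child_pid, 0, signalToForward(status)). Swallowing every SIGTRAP is a simplification: a SIGTRAP sent deliberately by another process would be dropped as well.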

What's the actual size of PSAPI_WORKING_SET_INFORMATION buffer used in QueryWorkingSet function of PSAPI.h

I'd like to use the function QueryWorkingSet available in PSAPI, but I'm having trouble working out the size of the buffer pv. Here is the code:
#include <Windows.h>
#include <Psapi.h>
#include <iostream>
void testQueryWorkingSet()
{
unsigned int counter;
HANDLE thisProcess = GetCurrentProcess();
SYSTEM_INFO si;
PSAPI_WORKING_SET_INFORMATION wsi, wsi2;
GetSystemInfo(&si);
QueryWorkingSet(thisProcess, &wsi, sizeof(wsi));
DWORD wsi2_buffer_size = (wsi.NumberOfEntries) * sizeof(PSAPI_WORKING_SET_BLOCK);
if (!QueryWorkingSet(thisProcess, &wsi2, wsi2_buffer_size))
{
std::cout << "ERROR CODE : " << GetLastError() << std::endl;
abort();
}
}
int main(int argc, char * argv[])
{
testQueryWorkingSet();
int* test = new int[1000000];
testQueryWorkingSet();
}
I keep ending up with abort() being called and either error code 24 or 998 during the first call to testQueryWorkingSet(), which I interpret respectively as: wsi2_buffer_size is too small and wsi2_buffer_size is too big.
Now I have no idea what value this variable should take. I tried:
counting everything including the NumberOfEntries field, that is DWORD wsi2_buffer_size = sizeof(wsi.NumberOfEntries) + wsi.NumberOfEntries * sizeof(PSAPI_WORKING_SET_BLOCK); => error 998;
counting only the number of entries, that is the code given above => error 998;
the size of the variable wsi2, that is DWORD wsi2_buffer_size = sizeof(wsi2); => error 24;
There has to be something I do not understand in the way we're supposed to use this function but I can't find what. I tried to adapt the code given there, that is :
#include <Windows.h>
#include <Psapi.h>
#include <iostream>
void testQueryWorkingSet()
{
unsigned int counter;
HANDLE thisProcess = GetCurrentProcess();
SYSTEM_INFO si;
PSAPI_WORKING_SET_INFORMATION wsi_1, * wsi;
DWORD wsi_size;
GetSystemInfo(&si);
wsi_1.NumberOfEntries = 0;
QueryWorkingSet(thisProcess, (LPVOID)&wsi_1, sizeof(wsi));
#if !defined(_WIN64)
wsi_1.NumberOfEntries--;
#endif
wsi_size = sizeof(PSAPI_WORKING_SET_INFORMATION)
+ sizeof(PSAPI_WORKING_SET_BLOCK) * wsi_1.NumberOfEntries;
wsi = (PSAPI_WORKING_SET_INFORMATION*)HeapAlloc(GetProcessHeap(),
HEAP_ZERO_MEMORY, wsi_size);
if (!QueryWorkingSet(thisProcess, (LPVOID)wsi, wsi_size)) {
printf("# Second QueryWorkingSet failed: %lu\n"
, GetLastError());
abort();
}
}
int main(int argc, char * argv[])
{
testQueryWorkingSet();
int* test = new int[1000000];
testQueryWorkingSet();
}
This code works for only one call to testQueryWorkingSet(); the second one aborts with error code 24. Here are the questions in brief:
How would you use QueryWorkingSet in a function that you could call multiple times in succession?
What does the value of the cb parameter in the documentation represent, given a PSAPI_WORKING_SET_INFORMATION?
Both examples are completely ignoring the return value and error code of the 1st call of QueryWorkingSet(). You are doing error handling only on the 2nd call.
Your 1st example fails because you are not taking into account the entire size of the PSAPI_WORKING_SET_INFORMATION when calculating wsi2_buffer_size for the 2nd call of QueryWorkingSet(). Even if the 1st call were successful, you are not allocating any additional memory for the 2nd call to fill in, if the NumberOfEntries returned is > 1.
Your 2nd example is passing in the wrong buffer size value to the cb parameter of the 1st call of QueryWorkingSet(). You are passing in just the size of a single pointer, not the size of the entire PSAPI_WORKING_SET_INFORMATION. Error 24 is ERROR_BAD_LENGTH. You need to use sizeof(wsi_1) instead of sizeof(wsi).
I would suggest calling QueryWorkingSet() in a loop, in case the working set actually changes in between the call to query its size and the call to get its data.
Also, be sure you free the memory you allocate when you are done using it.
With that said, try something more like this:
void testQueryWorkingSet()
{
    HANDLE thisProcess = GetCurrentProcess();
    PSAPI_WORKING_SET_INFORMATION *wsi, *wsi_new;
    DWORD wsi_size;
    ULONG_PTR count = 1; // or whatever initial size you want...
    do
    {
        wsi_size = offsetof(PSAPI_WORKING_SET_INFORMATION, WorkingSetInfo[count]);
        wsi = (PSAPI_WORKING_SET_INFORMATION*) HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, wsi_size);
        if (!wsi)
        {
            printf("HeapAlloc failed: %lu\n", GetLastError());
            abort();
        }
        if (QueryWorkingSet(thisProcess, wsi, wsi_size))
            break;
        if (GetLastError() != ERROR_BAD_LENGTH)
        {
            printf("QueryWorkingSet failed: %lu\n", GetLastError());
            HeapFree(GetProcessHeap(), 0, wsi);
            abort();
        }
        count = wsi->NumberOfEntries;
        HeapFree(GetProcessHeap(), 0, wsi);
    }
    while (true);
    // use wsi as needed...
    HeapFree(GetProcessHeap(), 0, wsi);
}

linux c++ synchronization method both inter and intra process

This question came up while I was developing a registry system (C/C++, kernel 2.6.32-642.6.2.el6.x86_64 #1 SMP) used to bookmark information for each database, which requires locking both between processes and within a process. Normally, lockf(), flock(), and fcntl() are the obvious candidates for inter-process locking, but I found that they do not work as expected for intra-process locking (multiple threads in the same process).
I tested it using the following program:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <fcntl.h> /* For O_RDWR */
#include <unistd.h> /* For open(), creat() */
#include <errno.h>
int counter = 0;
void* counterThread(void* ptr)
{
int lockfd = 0;
int tmpCounter = 0;
lockfd = open("/tmp/lockfile.txt", O_CREAT|O_WRONLY, 0666);
if(lockfd == -1)
{
printf("lockfile could not be created, errno:%d\n", errno);
return NULL;
}
if(lockf(lockfd, F_LOCK, 0) == -1)
{
printf("lockfile could not be locked, errno:%d\n", errno);
return NULL;
}
counter++;
tmpCounter = counter;
if(lockf(lockfd, F_ULOCK, 0) == -1)
{
printf("lockfile could not be unlocked, errno:%d\n", errno);
return NULL;
}
close(lockfd);
printf("counter is %d, lockfile is %d\n", tmpCounter, lockfd);
return NULL;
}
int main()
{
int threadNum = 30000;
pthread_t threads[30000];
int i = 0;
int rv = 0;
for(; i < threadNum; i++)
{
rv = pthread_create( &threads[i], NULL, &counterThread, NULL);
if(rv != 0)
{
printf("failed to create pthread %d\n", i);
return -1;
}
}
for(i = 0; i < threadNum; i++)
pthread_join(threads[i], NULL);
return 0;
}
The output would be:
counter is 1, lockfile is 4
counter is 2, lockfile is 3
counter is 3, lockfile is 5
counter is 4, lockfile is 6
counter is 7, lockfile is 4
...
counter is 29994, lockfile is 3
counter is 29995, lockfile is 3
counter is 29996, lockfile is 3
counter is 29997, lockfile is 3
counter is 29998, lockfile is 3
The output order is random and some numbers are sometimes missing, so there is definitely a race condition. I think the reason is probably that the fd opened for the same file in the same process is somehow optimized to be reused. Because all of these locking mechanisms are implemented at the granularity of the fd, the locking does not work in this case.
Given that background, I would like to ask the following questions:
Is there any way to force open to return a different fd for each thread in the same process, so that the locking works?
Is there any good practice or convenient API in Linux for both inter- and intra-process locking? What I can think of is the following (not verified yet), but I would like to know of easier ways:
(1) Use a mutex or semaphore to serialize access to these lockfile APIs for the critical resources
(2) shm_open a shared memory region, mmap it in the different processes, and place a semaphore/mutex inside it to lock the critical resources
Thanks in advance:)
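A note on the test program, followed by a sketch of option (1). Record locks taken with lockf() or fcntl() are owned by the process rather than by the file descriptor, so threads of one process never block each other on them, however many times the file is opened. flock() locks belong to the open file description instead. The sketch below is one possible arrangement, not an established API: it pairs a std::mutex (for threads) with flock() (for other processes). The class name FileAndThreadLock, the helper countWithLock, and the lock file path are made up for illustration.

```cpp
#include <cassert>
#include <fcntl.h>
#include <mutex>
#include <sys/file.h>
#include <thread>
#include <unistd.h>
#include <vector>

// Guard that excludes other threads (std::mutex) and other processes (flock).
// `FileAndThreadLock` is a hypothetical name, not an existing API.
class FileAndThreadLock {
public:
    FileAndThreadLock(std::mutex& m, const char* path) : guard_(m) {
        fd_ = open(path, O_CREAT | O_RDWR, 0666);
        // LOCK_EX blocks until no other open file description holds the lock.
        locked_ = fd_ != -1 && flock(fd_, LOCK_EX) == 0;
    }
    ~FileAndThreadLock() {
        if (locked_) flock(fd_, LOCK_UN);
        if (fd_ != -1) close(fd_);
    }
    FileAndThreadLock(const FileAndThreadLock&) = delete;
    FileAndThreadLock& operator=(const FileAndThreadLock&) = delete;
    bool ok() const { return locked_; }
private:
    std::lock_guard<std::mutex> guard_;  // destroyed last, after the file lock
    int fd_ = -1;
    bool locked_ = false;
};

// Run `threads` workers that each bump a shared counter under the lock.
int countWithLock(int threads, const char* path) {
    assert(threads >= 0);
    std::mutex m;
    int counter = 0;
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i)
        pool.emplace_back([&] {
            FileAndThreadLock lock(m, path);
            if (lock.ok()) ++counter;
        });
    for (auto& t : pool) t.join();
    return counter;
}
```

Calling countWithLock(30, "/tmp/lockfile.txt") should return 30 when the lock file can be opened, because every increment happens while both the mutex and the file lock are held. The mutex has to be shared by all threads of the process; the file lock only covers the other processes.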

Running an executable from a C++ program in the same process

Is that possible? I'd like easy access to the executable's memory to edit it. Alternatively, when I'm not the administrator, is it possible to edit the executable's memory from another process? I've tried the ptrace library and it fails if I'm not the administrator. I'm on Linux.
I'm not entirely sure what you are asking, but this is possible with shared memory.
See here: http://www.kernel.org/doc/man-pages/online/pages/man7/shm_overview.7.html
This is what a debugger does. You could look at the code of an open source debugger, e.g. gdb, to see how it works.
The answer:
Yes - it works: you don't have to be administrator / root, but of course you need the rights to access the process' memory, i.e. same user.
No - it is not easy
The possibility to write to /proc/pid/mem was added some time ago to the Linux kernel. Therefore it depends on the kernel you are using. The small programs were checked with kernel 3.2 where this works and 2.6.32 where it fails.
The solution consists of two programs:
A 'server' which is started, allocates some memory, writes a pattern into this memory, and every three seconds prints the memory contents located after the pattern.
A 'client' which connects to the server via /proc/pid/maps and /proc/pid/mem, searches for the pattern, and writes some other string into the server's memory.
The implementation uses the heap, but as long as the permissions allow it, it is also possible to change other portions of the other process' memory.
This is implemented in C because it is very 'low level', but it should work in C++. It is a proof of concept, not production code: for example, some error checks are missing and it uses some fixed-size buffers.
memholder.c
/*
* Alloc memory - write in some pattern and print out some bytes
* after the pattern.
*
* Compile: gcc -Wall -Werror memholder.c -o memholder
*/
#include <sys/types.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
int main() {
char * m = (char*) malloc(2048);
memset(m, '\xAA', 1024);
strcpy(m + 1024, "Some local data.");
printf("PID: %d\n", getpid());
while(1) {
printf("%s\n", m + 1024);
sleep(3);
}
return 0;
}
memwriter.c
/*
* Searches for a pattern in the given PIDs memory
* and changes some bytes after them.
*
* Compile: gcc -Wall -std=c99 -Werror memwriter.c -o memwriter
*/
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
int open_proc_file(pid_t other_pid, char const * const sn,
int flags) {
char fname[1024];
snprintf(fname, 1023, "/proc/%d/%s", other_pid, sn);
// Open file for reading and writing
int const fd = open(fname, flags );
if(fd==-1) {
perror("Open file");
exit(1);
}
return fd;
}
void get_heap(int fd_maps, size_t * heap_start, size_t * heap_end) {
char buf[65536];
ssize_t const r = read(fd_maps, buf, 65535);
if(r==-1) {
perror("Reading maps file");
exit(1);
}
buf[r] = '\0';
char * const heap = strstr(buf, "[heap]");
if(heap==NULL) {
printf("[heap] not found in maps file");
exit(1);
}
// Look backward to the latest newline
char const * hl_start;
for(hl_start = heap; hl_start > buf && *hl_start != '\n';
--hl_start) {}
// skip \n
++hl_start;
// Convert to beginning and end address
char * lhe;
*heap_start = strtol(hl_start, &lhe, 16);
++lhe;
*heap_end = strtol(lhe, &lhe, 16);
}
int main(int argc, char *argv[]) {
if(argc!=2) {
printf("Usage: memwriter <pid>\n");
return 1;
}
pid_t const other_pid = atoi(argv[1]);
int fd_mem = open_proc_file(other_pid, "mem", O_RDWR);
int fd_maps = open_proc_file(other_pid, "maps", O_RDONLY);
size_t other_mem_start;
size_t other_mem_end;
get_heap(fd_maps, &other_mem_start, &other_mem_end);
ptrace(PTRACE_ATTACH, other_pid, NULL, NULL);
waitpid(other_pid, NULL, 0);
if( lseek(fd_mem, other_mem_start, SEEK_SET) == -1 ) {
perror("lseek");
return 1;
}
char buf[512];
do {
ssize_t const r = read(fd_mem, buf, 512);
if(r!=512) {
perror("read?");
break;
}
// Check for pattern
int pat_found = 1;
for(int i = 0; i < 512; ++i) {
if( buf[i] != '\xAA' ) {
pat_found = 0;
break;
}
}
if( ! pat_found ) continue;
// Write about one k of strings
char const * const wbuf = "REMOTE DATA - ";
for(int i = 0; i < 70; ++i) {
ssize_t const w = write(fd_mem, wbuf, strlen(wbuf));
if( w == -1) {
perror("Write");
return 1;
}
}
// Append a \0
write(fd_mem, "\0", 1);
break;
} while(1);
ptrace(PTRACE_DETACH, other_pid, NULL, NULL);
close(fd_mem);
close(fd_maps);
return 0;
}
Example output
$ ./memholder
PID: 2621
Some local data.
Some local data.
MOTE DATA - REMOTE DA...
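The address parsing that get_heap performs above could also be done one line at a time, which avoids depending on a fixed-size buffer holding the whole maps file. The following is only a sketch: parseMapsLine and MapRange are hypothetical names, and it assumes each line starts with the usual "start-end" hexadecimal address pair.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Address range of one mapping, as listed in /proc/<pid>/maps.
struct MapRange {
    unsigned long long start;
    unsigned long long end;
};

// Parse one maps line such as
//   "01c3f000-01c60000 rw-p 00000000 00:00 0   [heap]"
// Returns true and fills `out` only if the line contains `label`.
// `parseMapsLine` is a hypothetical helper, not taken from the answer.
bool parseMapsLine(const char* line, const char* label, MapRange* out) {
    assert(line && label && out);
    if (std::strstr(line, label) == nullptr)
        return false;
    unsigned long long s = 0, e = 0;
    if (std::sscanf(line, "%llx-%llx", &s, &e) != 2 || e <= s)
        return false;
    out->start = s;
    out->end = e;
    return true;
}
```

A caller would read /proc/<pid>/maps with fgets() in a loop and stop at the first line for which parseMapsLine(line, "[heap]", &range) returns true.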
Other interpretation
There is also another interpretation of your question (when reading the headline and not the question), that you want to replace the 'executable' from one process with another one. That can be easily handled by exec() (and friends):
From man exec:
The exec() family of functions replaces the current process image with a new process image.
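As a minimal sketch of this second interpretation: because exec() replaces the calling process, the usual pattern is to fork() first so that only the child's image is replaced while the parent waits. runAndWait is a hypothetical helper name, and returning 127 when the program cannot be executed is a convention borrowed from shells, not something exec() does itself.

```cpp
#include <cassert>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

// Fork, replace the child's process image with the program named in args[0],
// and return the child's exit code. Returns 127 if exec failed and -1 if
// fork/waitpid failed or the child was killed by a signal.
// `runAndWait` is a hypothetical helper name.
int runAndWait(const std::vector<std::string>& args) {
    assert(!args.empty());
    std::vector<char*> argv;
    for (const auto& a : args)
        argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv.data());  // only returns on failure
        _exit(127);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, runAndWait({"sh", "-c", "exit 3"}) returns 3. Note that the new program runs in a separate process with its own address space, so this does not give the parent access to its memory.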
In Windows, the functions used for this are named ReadProcessMemory / WriteProcessMemory; you will, however, need administrative rights for this. The same goes for Linux: as I said in my comment, no sane system would allow a user process to modify memory it does not own.
For Linux, the only function is ptrace. You will need to be administrator.
http://cboard.cprogramming.com/cplusplus-programming/92093-readprocessmemory-writeprocessmemory-linux-equivalent.html contains more detailed discussion.
Can you imagine the consequences of allowing process to modify other process memory, without being administrator?

inotify notifies of a new file wrongly multiple times

I am using inotify to monitor a directory for any new file created in it, by adding a watch on the directory:
fd = inotify_init();
wd = inotify_add_watch(fd, "filename_with_path", IN_CLOSE_WRITE);
inotify_add_watch(fd, directory_name, IN_CLOSE_WRITE);
const int event_size = sizeof(struct inotify_event);
const int buf_len = 1024 * (event_size + FILENAME_MAX);
while(true) {
    char buf[buf_len];
    int no_of_events, count = 0;
    no_of_events = read(fd, buf, buf_len);
    while(count < no_of_events) {
        struct inotify_event *event = (struct inotify_event *) &buf[count];
        if (event->len) {
            if (event->mask & IN_CLOSE_WRITE) {
                if (!(event->mask & IN_ISDIR)) {
                    //It's here multiple times
                }
            }
        }
        count += event_size + event->len;
    }
}
When I scp a file to the directory, this loops infinitely. What is the problem with this code? It shows the same event name and event mask each time, so the same event is reported over and over.
There are no break statements. If I find an event, I just print it and carry on waiting for another event on read(), which should be a blocking call. Instead, it starts looping infinitely. This means read() doesn't block but keeps returning the same value for one file.
This entire operation runs on a separate boost::thread.
EDIT:
Sorry all. The error I was getting was not caused by inotify but by sqlite, which was tricky to detect at first. I think I jumped the gun here. With further investigation, I found that inotify works perfectly well. The error actually came from the sqlite command ATTACH.
That command was not a read-only command as it was supposed to be. It was writing some metadata to the file, so inotify was notified again and again. Since the events were happening so fast, it broke the application. I finally had to break up the code to understand why.
Thanks everyone.
I don't see anything wrong with your code...I'm running basically the same thing and it's working fine. I'm wondering if there's a problem with the test, or some part of the code that's omitted. If you don't mind, let's see if we can remove any ambiguity.
Can you try this out (I know it's almost the same thing, but just humor me) and let me know the results of the exact test?
1) Put the following code into test.c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
#include <sys/inotify.h>
int main (int argc, char *argv[])
{
char target[FILENAME_MAX];
int result;
int fd;
int wd; /* watch descriptor */
const int event_size = sizeof(struct inotify_event);
const int buf_len = 1024 * (event_size + FILENAME_MAX);
strcpy (target, ".");
fd = inotify_init();
if (fd < 0) {
printf ("Error: %s\n", strerror(errno));
return 1;
}
wd = inotify_add_watch (fd, target, IN_CLOSE_WRITE);
if (wd < 0) {
printf ("Error: %s\n", strerror(errno));
return 1;
}
while (1) {
char buff[buf_len];
int no_of_events, count = 0;
no_of_events = read (fd, buff, buf_len);
while (count < no_of_events) {
struct inotify_event *event = (struct inotify_event *)&buff[count];
if (event->len){
if (event->mask & IN_CLOSE_WRITE)
if(!(event->mask & IN_ISDIR)){
printf("%s opened for writing was closed\n", target);
fflush(stdout);
}
}
count += event_size + event->len;
}
}
return 0;
}
2) Compile it with gcc:
gcc test.c
3) kick it off in one window:
./a.out
4) in a second window from the same directory try this:
echo "hi" > blah.txt
Let me know if that works correctly, showing output every time the file is written to without looping as your code does. If so, there's something important you're omitting from your code. If not, then there's some difference between the systems.
Sorry for putting this in the "answer" section, but too much for a comment.
My guess is that read is returning -1, and since you don't ever try to handle the error, you get another error on the next call to read, which also returns -1.
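To make that failure mode harmless, the result of read() can be checked before the buffer is walked. The sketch below is an illustration, not the poster's code: closedFileNames is a hypothetical helper, and it assumes the buffer is suitably aligned for struct inotify_event and was filled by a single read() on the inotify descriptor.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <sys/inotify.h>
#include <sys/types.h>
#include <vector>

// Collect the names of non-directory entries reported with IN_CLOSE_WRITE in
// a buffer that read() filled with `len` bytes. A non-positive `len` (error
// or nothing read) yields an empty list instead of walking stale contents.
// `closedFileNames` is a hypothetical helper name.
std::vector<std::string> closedFileNames(const char* buf, ssize_t len) {
    assert(buf != nullptr);
    std::vector<std::string> names;
    if (len <= 0)
        return names;
    const std::size_t end = static_cast<std::size_t>(len);
    std::size_t pos = 0;
    while (pos + sizeof(inotify_event) <= end) {
        const inotify_event* ev =
            reinterpret_cast<const inotify_event*>(buf + pos);
        const std::size_t next = pos + sizeof(inotify_event) + ev->len;
        if (next > end)
            break;  // truncated record: stop rather than read past the data
        if (ev->len > 0 && (ev->mask & IN_CLOSE_WRITE) && !(ev->mask & IN_ISDIR))
            names.emplace_back(ev->name);
        pos = next;
    }
    return names;
}
```

The calling loop would then do something like ssize_t n = read(fd, buf, sizeof buf); retry when n is -1 and errno is EINTR; report the error and leave the loop on any other failure; and otherwise pass buf and n to closedFileNames.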