How to query the proc file system programmatically and equivalently on Windows? - c++

I am working on an implementation where I need to store the entry point of every function visited in a hashmap. In order to create an effective hash function, I would need to know the minimum and maximum possible function entry points once a program is loaded in memory.
What would be the ideal way to do this programmatically (ideally on both Windows and Linux), so that a program already loaded in memory can determine its minimum and maximum function entry points?
I was thinking that maybe I should query the loading address of the process and determine the process size, but on second thought, a process size may include the stack and heap sizes, which would be meaningless for me.
What I am probably looking for is /proc/<processid>/maps, so for Linux the question might be: how do I query the proc file system programmatically, and what is the equivalent approach on Windows?

I do not know how to access this information on Windows, but on Linux I see the following in man proc:
/proc/[pid]/maps
A file containing the currently mapped memory regions and their access permissions.
The format is:
address perms offset dev inode pathname
08048000-08056000 r-xp 00000000 03:0c 64593 /usr/sbin/gpm
08056000-08058000 rw-p 0000d000 03:0c 64593 /usr/sbin/gpm
08058000-0805b000 rwxp 00000000 00:00 0
40000000-40013000 r-xp 00000000 03:0c 4165 /lib/ld-2.2.4.so
40013000-40015000 rw-p 00012000 03:0c 4165 /lib/ld-2.2.4.so
4001f000-40135000 r-xp 00000000 03:0c 45494 /lib/libc-2.2.4.so
40135000-4013e000 rw-p 00115000 03:0c 45494 /lib/libc-2.2.4.so
4013e000-40142000 rw-p 00000000 00:00 0
bffff000-c0000000 rwxp 00000000 00:00 0
where "address" is the address space in the process that it occupies, "perms" is a set of permis-
sions:
r = read
w = write
x = execute
s = shared
p = private (copy on write)
"offset" is the offset into the file/whatever, "dev" is the device (major:minor), and "inode" is the
inode on that device. 0 indicates that no inode is associated with the memory region, as the case
would be with BSS (uninitialized data).
Under Linux 2.0 there is no field giving pathname.
You open the file, you parse it (possibly using fscanf), and you look for segments with the 'x' permission (maybe also without 'w'). Those are the address ranges where functions can be found.
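A minimal sketch of that approach, assuming the maps format quoted above (the buffer sizes and output format are my own choices):

    #include <cstdio>
    #include <cstring>

    int main()
    {
        // Parse this process's own map; substitute /proc/<pid>/maps
        // for another process.
        FILE *maps = std::fopen("/proc/self/maps", "r");
        if (!maps)
            return 1;

        char line[512];
        while (std::fgets(line, sizeof line, maps)) {
            unsigned long start, end;
            char perms[8];
            // Each line starts with: start-end perms ...
            if (std::sscanf(line, "%lx-%lx %7s", &start, &end, perms) != 3)
                continue;
            // Executable, non-writable segments are where code
            // (function entry points) lives.
            if (std::strchr(perms, 'x') && !std::strchr(perms, 'w'))
                std::printf("code segment: %#lx-%#lx (%s)\n", start, end, perms);
        }
        std::fclose(maps);
        return 0;
    }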

Related

How do I find why the virtual memory footprint continuously grows with this daemon?

I created a daemon which I use as a proxy to the Cassandra database. I call it snapdbproxy as it proxies my CQL commands from my other servers and tools.
Whenever I access that tool, it creates a new thread, manages various CQL commands, and then I cleanly exit the thread once the connection is lost.
Looking at the memory footprint, it grows really fast (the most active systems quickly reach gigabytes of virtual memory, which makes use of some swap memory that then grows constantly). On startup, it is around 300 MB.
The software is written in C++ with destructors, RAII, smart pointers, etc., but I still verified:
With -fsanitize=address (I use g++ under Linux), I get no leaks (okay, a very few: under 300 bytes, because I can't find how to get rid of a few crypto buffers created by OpenSSL).
With valgrind's massif, which says I use 4.7 MB at initialization time and then under 4 MB ongoing (I ran the same code for over 1 h with the same results!).
Here is some output of ms_print (I removed the stack column, since it's all zeroes).
-------------------------------------------------------------------
n time(i) total(B) useful-heap(B) extra-heap(B)
-------------------------------------------------------------------
0 0 0 0 0
1 78,110,172 4,663,704 4,275,532 388,172
2 172,552,798 3,600,840 3,369,538 231,302
3 269,590,806 3,611,600 3,379,648 231,952
4 350,518,548 3,655,208 3,420,483 234,725
5 425,873,410 3,653,856 3,419,390 234,466
...
67 4,257,283,952 3,693,160 3,459,545 233,615
68 4,302,665,173 3,694,624 3,460,827 233,797
69 4,348,046,440 3,693,728 3,457,524 236,204
70 4,393,427,319 3,685,064 3,449,697 235,367
71 4,438,812,133 3,698,352 3,461,918 236,434
As we can see, after one hour and many accesses from various other daemons (at least 100 accesses), valgrind tells me that I am using only around 4 MB of memory. I tried twice, thinking that the first attempt had probably failed. Same results.
So... I'm more or less out of ideas. Why would my process continue to grow in terms of virtual memory, even though everything is correctly freed on exit of each thread (as shown by the massif output) and of the entire process (as shown by -fsanitize=address; okay, I'm not showing the sanitizer output here, but trust me, it's under 300 bytes, not gigabytes of leaks)?
Here is the output of a watch command after a while, as I'm looking at the virtual memory status:
Every 1.0s: grep ^Vm /proc/1773/status Tue Oct 2 21:36:42 2018
VmPeak: 1124060 kB <-- starts at under 300 MB...
VmSize: 1124060 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 108776 kB
VmRSS: 108776 kB
VmData: 963920 kB <-- this tags along
VmStk: 132 kB
VmExe: 1936 kB
VmLib: 65396 kB
VmPTE: 888 kB <-- this increases too (necessary to handle the large Vm)
VmPMD: 20 kB
VmSwap: 0 kB
The VmPeak, VmSize, and VmData all increase each time the other daemons run (about once every 5 min.)
However, the memory (malloc/free) is not changing. I am now logging sbrk(0) (an idea from 1201ProgramAlarm's comment, or at least my interpretation of its first part), and that address remains the same:
sbrk() = 0x4228000
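For reference, the logging itself can be as small as this sketch (the helper name is mine; sbrk(0) just reports the current program break):

    #include <unistd.h>  // sbrk()
    #include <cstdio>

    void log_heap_top()
    {
        // sbrk(0) returns the current "program break", i.e. the top of
        // the classic brk-based heap; if it never moves, the growth is
        // elsewhere (for example in mmap'd regions such as thread stacks).
        std::printf("sbrk() = %p\n", sbrk(0));
    }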
As suggested by phd, I looked at the contents of /proc/<pid>/maps over time. Here is a diff showing one or two increments. Unfortunately, I'm not told what creates these buffers. The only thing I can think of is my threads... (i.e., a stack and a little space for the thread status)
--- a1 2018-10-02 21:50:21.887583577 -0700
+++ a2 2018-10-02 21:52:04.823169545 -0700
@@ -522,6 +522,10 @@
59dd0000-5a5d0000 rw-p 00000000 00:00 0
5a5d0000-5a5d1000 ---p 00000000 00:00 0
5a5d1000-5add1000 rw-p 00000000 00:00 0
+5add1000-5add2000 ---p 00000000 00:00 0
+5add2000-5b5d2000 rw-p 00000000 00:00 0
+5b5d2000-5b5d3000 ---p 00000000 00:00 0
+5b5d3000-5bdd3000 rw-p 00000000 00:00 0
802001000-802b8c000 rwxp 00000000 00:00 0
802b8c000-802b8e000 ---p 00000000 00:00 0
802b8e000-802c8e000 rwxp 00000000 00:00 0
Oh... yep! My latest change, from detached threads to joined threads... doesn't actually join the threads at all. Testing with a proper join now... and it works right! My bad!
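For anyone hitting the same symptom: a thread that is neither joined nor detached keeps its stack mapped forever, which matches the repeated ~8 MB rw-p blocks in the diff above. A minimal sketch of the fix with std::thread (the worker name is hypothetical):

    #include <thread>
    #include <vector>

    // Hypothetical stand-in for the per-connection CQL worker.
    void handle_connection() { /* ... serve one connection ... */ }

    int main()
    {
        std::vector<std::thread> workers;
        for (int i = 0; i < 100; ++i)
            workers.emplace_back(handle_connection);

        // Join every thread when it is done. A thread that is never
        // joined nor detached leaves its ~8 MB stack mapped, so the
        // process's virtual size grows by one rw-p region per
        // connection, exactly as seen in /proc/<pid>/maps.
        for (std::thread &t : workers)
            t.join();
        return 0;
    }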

Segmentation fault itself is hanging

I have had some problems with a server today, and I have now boiled it down to the server not being able to get rid of processes that hit a segfault.
After the process segfaults, it just keeps hanging, never getting killed.
A test that should cause the error Segmentation fault (core dumped):
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    char *buf;
    buf = malloc(1<<31); /* note: 1<<31 overflows a 32-bit int; see the answer below */
    fgets(buf, 1024, stdin);
    printf("%s\n", buf);
    return 1;
}
Compile and set permissions with gcc segfault.c -o segfault && chmod +x segfault.
Running this (and pressing enter one time) on the problematic server causes it to hang. I also ran this on another server with the same kernel version (and most of the same packages), and it gets the segfault and then quits.
Here are the last few lines after running strace ./segfault on both of the servers.
Bad server
"\n", 1024) = 1
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0} ---
# It hangs here....
Working server
"\n", 1024) = 1
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0} ---
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault (core dumped)
root#server { ~ }# echo $?
139
When the process hangs (after it has segfaulted), this is how it looks.
Not able to ^c it
root#server { ~ }# ./segfault
^C^C^C
Entry from ps aux
root 22944 0.0 0.0 69700 444 pts/18 S+ 15:39 0:00 ./segfault
cat /proc/22944/stack
[<ffffffff81223ca8>] do_coredump+0x978/0xb10
[<ffffffff810850c7>] get_signal_to_deliver+0x1c7/0x6d0
[<ffffffff81013407>] do_signal+0x57/0x6c0
[<ffffffff81013ad9>] do_notify_resume+0x69/0xb0
[<ffffffff8160bbfc>] retint_signal+0x48/0x8c
[<ffffffffffffffff>] 0xffffffffffffffff
Another funny thing is that I am unable to attach strace to a hanging segfault process. Doing so actually makes it get killed.
root#server { ~ }# strace -p 1234
Process 1234 attached
+++ killed by SIGSEGV (core dumped) +++
ulimit -c 0 is set, and ulimit -c, ulimit -H -c, and ulimit -S -c all show the value 0.
Kernel version: 3.10.0-229.14.1.el7.x86_64
Distro-version: Red Hat Enterprise Linux Server release 7.1 (Maipo)
Running in VMware
The server is working as it should on everything else.
Update
Shutting down abrt (systemctl stop abrtd.service) fixed the problem, both for processes already hung after a core dump and for new processes core-dumping. Starting abrt up again did not bring back the problem.
Update 2016-01-26
We got a problem that looked similar, but not quite the same. The initial code used to test:
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    char *buf;
    buf = malloc(1<<31);
    fgets(buf, 1024, stdin);
    printf("%s\n", buf);
    return 1;
}
was hanging. The output of cat /proc/<pid>/maps was
00400000-00401000 r-xp 00000000 fd:00 13143328 /root/segfault
00600000-00601000 r--p 00000000 fd:00 13143328 /root/segfault
00601000-00602000 rw-p 00001000 fd:00 13143328 /root/segfault
7f6c08000000-7f6c08021000 rw-p 00000000 00:00 0
7f6c08021000-7f6c0c000000 ---p 00000000 00:00 0
7f6c0fd5b000-7f6c0ff11000 r-xp 00000000 fd:00 14284 /usr/lib64/libc-2.17.so
7f6c0ff11000-7f6c10111000 ---p 001b6000 fd:00 14284 /usr/lib64/libc-2.17.so
7f6c10111000-7f6c10115000 r--p 001b6000 fd:00 14284 /usr/lib64/libc-2.17.so
7f6c10115000-7f6c10117000 rw-p 001ba000 fd:00 14284 /usr/lib64/libc-2.17.so
7f6c10117000-7f6c1011c000 rw-p 00000000 00:00 0
7f6c1011c000-7f6c1013d000 r-xp 00000000 fd:00 14274 /usr/lib64/ld-2.17.so
7f6c10330000-7f6c10333000 rw-p 00000000 00:00 0
7f6c1033b000-7f6c1033d000 rw-p 00000000 00:00 0
7f6c1033d000-7f6c1033e000 r--p 00021000 fd:00 14274 /usr/lib64/ld-2.17.so
7f6c1033e000-7f6c1033f000 rw-p 00022000 fd:00 14274 /usr/lib64/ld-2.17.so
7f6c1033f000-7f6c10340000 rw-p 00000000 00:00 0
7ffc13b5b000-7ffc13b7c000 rw-p 00000000 00:00 0 [stack]
7ffc13bad000-7ffc13baf000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
However, the smaller C code (int main(void){*(volatile char*)0=0;}) to trigger a segfault did cause a segfault and did not hang...
WARNING - this answer contains a number of suppositions based on the incomplete information to hand. Hopefully it is still useful though!
Why does the segfault appear to hang?
As the stack trace shows, the kernel is busy creating a core dump of the crashed process.
But why does this take so long? A likely explanation is that the method you are using to create the segfaults is resulting in the process having a massive virtual address space.
As pointed out in the comments by M.M., the outcome of the expression 1<<31 is undefined by the C standards, so it is difficult to say what actual value is being passed to malloc, but based on the subsequent behavior I am assuming it is a large number.
Note that for malloc to succeed it is not necessary for you to actually have this much RAM in your system - the kernel will expand the virtual size of your process but actual RAM will only be allocated when your program actually accesses this RAM.
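A small sketch illustrating that overcommit behavior (the 1 GiB figure is arbitrary, and the exact outcome depends on the kernel's overcommit settings):

    #include <cstdio>
    #include <cstdlib>
    #include <cstring>

    int main()
    {
        // With default Linux overcommit, this malloc typically succeeds
        // even without 1 GiB of free RAM: only the virtual size (VmSize)
        // grows.
        std::size_t big = 1UL << 30;  /* 1 GiB */
        char *p = static_cast<char *>(std::malloc(big));
        if (!p)
            return 1;
        std::puts("allocated: VmSize grew, VmRSS did not (see /proc/self/status)");

        // Touching the pages is what commits physical memory.
        std::memset(p, 1, big);
        std::puts("touched: now VmRSS has grown too");

        std::free(p);
        return 0;
    }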
I believe the call to malloc succeeds, or at least returns, because you state that it segfaults after you press enter, so after the call to fgets.
In any case, the segfault is leading the kernel to perform a core dump. If the process has a large virtual size, that could take a long time, especially if the kernel decides to dump all pages, even those that have never been touched by the process. I am not sure if it will do that, but if it did, and if there was not enough RAM in the system, it would have to begin swapping pages in and out of memory in order to dump them to the core file. This would generate a high IO load, which could lead the process to appear unresponsive (and overall system performance would be degraded).
You may be able to verify some of this by looking in the abrtd dump directory (possibly /var/tmp/abrt, or check /etc/abrt/abrt.conf) where you may find the core dumps (or perhaps partial core dumps) that have been created.
If you are able to reproduce the behavior, then you can check:
/proc/[pid]/maps to see the address space map of the process and see if it really is large
Use a tool like vmstat to see if the system is swapping, the amount of I/O going on, and how much IO wait state is being experienced
If you had sar running then you may be able to see similar information even for the period prior to restarting abrtd.
Why is a core dump created, even though ulimit -c is 0?
According to this bug report, abrtd will trigger collection of a core dump regardless of ulimit settings.
Why did this not start happening again when abrtd was started up once more?
There are a couple of possible explanations for that. For one thing, it would depend on the amount of free RAM in the system. It might be that a single core dump of a large process would not take that long, and not be perceived as hanging, if there is enough free RAM and the system is not pushed to swap.
If in your initial experiments you had several processes in this state, then the symptoms would be far worse than is the case when just getting a single process to misbehave.
Another possibility is that the configuration of abrtd had been altered but the service not yet reloaded, so that when you restarted it, it began using the new configuration, perhaps changing its behavior.
It is also possible that a yum update had updated abrtd, but not restarted it, so that when you restarted it, the new version was running.

How to return the address of all the accessible memory locations?

The question statement:
I understand that it's not possible to gain direct access to all memory locations using C++, but all I want to do is return all the RAM memory locations accessible to the application. It is impossible to store this address list in a dynamic variable, as we would need even more space to hold the addresses of the variables holding this list! But that doesn't matter, as I could save it in a text file. Even one attempt to read such a memory location crashes the application. If, instead of crashing, the application gave a better error message, I could write code to obtain the addresses of all the accessible memory locations.
If there is a method to return this address list, I would like to know it (I am using Windows 7). Thanks in advance.
edit 1
I have tried the following idea to test whether a memory location is accessible.
There are two programs involved.
The first:
#include <stdlib.h>
#include <iostream>
#include <conio.h>
using namespace std;

int main()
{
    // system() returns the probe's exit status: 0 means it read the
    // location and exited normally.
    if (!system("\"C:\\Users\\ ...\\read0.exe \"")) // I have not provided the complete path
        cout << "readable memory location";
    else
        cout << "not readable";
    getch();
    return 0;
}
The second:
/// one trying to read the accessible memory location
/// read0.cpp; released as read0.exe
int main()
{
    int *a; // runs on a 32-bit computer
    a = (int *)0;
    if (*a)
    { /* do nothing; just try reading the memory location */
    }
    return 0;
}
When the first console program runs the second, the second program crashes; this makes the operating system open a window asking whether to debug or close the application. That delays things and makes the approach even more inefficient. So I have been trying to find a way to temporarily disable this message box from popping up, and I have found this.
extended question
Could you give the command line for the Windows 7 command prompt, so that I could use the system command in C++ to set the DWORD value to zero and thereby disable the error message during crashes? I am new to editing the registry using cmd.
As far as I know this cannot be done portably.
In Linux there is /proc/<PID>/maps, which (I believe) gives the entire list of mapped virtual memory addresses for a process. It looks something like this:
00400000-004e5000 r-xp 00000000 08:01 11567110 /bin/bash
006e4000-006e5000 r--p 000e4000 08:01 11567110 /bin/bash
006e5000-006ee000 rw-p 000e5000 08:01 11567110 /bin/bash
006ee000-006f4000 rw-p 00000000 00:00 0
02243000-0242e000 rw-p 00000000 00:00 0 [heap]
7f1d1744e000-7f1d17459000 r-xp 00000000 08:01 5260634 /lib/x86_64-linux-gnu/libnss_files-2.13.so
7f1d17459000-7f1d17658000 ---p 0000b000 08:01 5260634 /lib/x86_64-linux-gnu/libnss_files-2.13.so
7f1d17658000-7f1d17659000 r--p 0000a000 08:01 5260634 /lib/x86_64-linux-gnu/libnss_files-2.13.so
7f1d17659000-7f1d1765a000 rw-p 0000b000 08:01 5260634 /lib/x86_64-linux-gnu/libnss_files-2.13.so
7f1d1765a000-7f1d17664000 r-xp 00000000 08:01 5260630 /lib/x86_64-linux-gnu/libnss_nis-2.13.so
7f1d17664000-7f1d17863000 ---p 0000a000 08:01 5260630 /lib/x86_64-linux-gnu/libnss_nis-2.13.so
7f1d17863000-7f1d17864000 r--p 00009000 08:01 5260630 /lib/x86_64-linux-gnu/libnss_nis-2.13.so
(34 more lines)
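For the Windows side of the question, a non-portable but common approach is to walk the process's own address space with VirtualQuery and report the committed, accessible regions. A minimal sketch, assuming a process inspecting itself:

    #include <windows.h>
    #include <cstdio>

    int main()
    {
        MEMORY_BASIC_INFORMATION mbi;
        unsigned char *addr = nullptr;

        // Walk the user address space region by region; VirtualQuery
        // returns 0 once we run past the last queryable address.
        while (VirtualQuery(addr, &mbi, sizeof mbi) == sizeof mbi) {
            if (mbi.State == MEM_COMMIT &&
                !(mbi.Protect & PAGE_NOACCESS) &&
                !(mbi.Protect & PAGE_GUARD)) {
                std::printf("%p - %p accessible\n",
                            mbi.BaseAddress,
                            static_cast<unsigned char *>(mbi.BaseAddress) + mbi.RegionSize);
            }
            addr = static_cast<unsigned char *>(mbi.BaseAddress) + mbi.RegionSize;
        }
        return 0;
    }

This avoids probing addresses one by one (and the crash dialogs that come with it): the OS already knows which pages are mapped.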

How to correct *** glibc detected *** error in the program [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
glibc detected error
Hi,
I was executing my project in GNU C++ when I received this error after pressing an option in the switch case. As the rest of the program executes fine, I am left with this error. I don't know what it is or why it occurs. Please explain, and guide me as to where I should start looking in my program.
Error Details:
*** glibc detected *** ./test.out: free(): invalid pointer: 0xbfb1c874 ***
======= Backtrace: =========
/lib/libc.so.6[0x55c0f1]
/lib/libc.so.6(cfree+0x90)[0x55fbc0]
./test.out[0x809f855]
./test.out[0x804fbc0]
./test.out[0x804f9bb]
./test.out[0x80502bb]
./test.out[0x805084e]
./test.out[0x8050d07]
/lib/libc.so.6(__libc_start_main+0xdc)[0x508e8c]
./test.out[0x8049981]
======= Memory map: ========
004f3000-00631000 r-xp 00000000 08:01 6148422 /lib/libc-2.5.so
00631000-00633000 r-xp 0013e000 08:01 6148422 /lib/libc-2.5.so
00633000-00634000 rwxp 00140000 08:01 6148422 /lib/libc-2.5.so
00634000-00637000 rwxp 00634000 00:00 0
0078d000-007a7000 r-xp 00000000 08:01 6152013 /lib/ld-2.5.so
007a7000-007a8000 r-xp 00019000 08:01 6152013 /lib/ld-2.5.so
007a8000-007a9000 rwxp 0001a000 08:01 6152013 /lib/ld-2.5.so
007f9000-0081e000 r-xp 00000000 08:01 6148435 /lib/libm-2.5.so
0081e000-0081f000 r-xp 00024000 08:01 6148435 /lib/libm-2.5.so
0081f000-00820000 rwxp 00025000 08:01 6148435 /lib/libm-2.5.so
00b18000-00b23000 r-xp 00000000 08:01 6148439 /lib/libgcc_s-4.1.2-20080825.so.1
00b23000-00b24000 rwxp 0000a000 08:01 6148439 /lib/libgcc_s-4.1.2-20080825.so.1
08048000-080c6000 r-xp 00000000 00:1e 736543 /users/guest10/shashi/Demo/src/test.out
080c6000-080c7000 rwxp 0007e000 00:1e 736543 /users/guest10/shashi/Demo/src/test.out
080c7000-080cc000 rwxp 080c7000 00:00 0
08d05000-218b1000 rwxp 08d05000 00:00 0 [heap]
b7e00000-b7e21000 rwxp b7e00000 00:00 0
b7e21000-b7f00000 ---p b7e21000 00:00 0
b7fab000-b7fac000 rwxp b7fab000 00:00 0
b7fc4000-b7fc7000 rwxp b7fc4000 00:00 0
b7fc7000-b7fc8000 r-xp b7fc7000 00:00 0 [vdso]
bfb0b000-bfb21000 rw-p bffe9000 00:00 0 [stack]
Abort
Please help. Thanks in advance.
The exact solution can only be provided if you show us the code. The error, however, is clear: the code frees memory that is not, or is no longer, valid. That means either the address is wrong, for example because pointer arithmetic was done on the original pointer, or the pointer has already been freed (a double free).
You are most likely trying to free memory that wasn't dynamically allocated. Maybe you have an unnecessary free, or a typo like free(&buf) instead of free(buf).
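A minimal example of the kind of code that produces this abort (freeing an address that never came from malloc):

    #include <cstdlib>

    int main()
    {
        char buf[16];
        // buf lives on the stack, not on the heap, so glibc's allocator
        // detects the bad pointer and aborts with "free(): invalid pointer".
        std::free(buf); // BUG: only free what malloc/calloc/realloc returned
        return 0;
    }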
Compile your program with the -g flag and run it through a debugger or memory debugger. That will show you exactly where the error happens.
It looks like you are trying to free an invalid pointer. You can run the program under a memory checker like Valgrind, like so:
valgrind --tool=memcheck --leak-check=full --track-origins=yes --show-reachable=yes --log-file=val.log ./<executable> <parameters>
Look at val.log and you should be able to figure out where the invalid free (and any memory leaks) occur. Also, you can try to step through the code with gdb/ddd (debuggers); the program will fail at the spot where the error occurs. To make the code debuggable, you will need to recompile it with the -g flag.
Alternatively, you could post your code here and let the community see where you are going wrong.

glibc detected *** ./a.out: free(): invalid pointer:

When I run my compiled C++ program on Linux, I get the following error. Please help me.
*** glibc detected *** ./a.out: free(): invalid pointer: 0x0804878d ***
======= Backtrace: =========
/lib/libc.so.6[0xbd5f18]
/lib/libc.so.6(__libc_free+0x79)[0xbd941d]
/usr/lib/libstdc++.so.6(_ZdlPv+0x21)[0x3233fe1]
./a.out(__gxx_personality_v0+0x100)[0x8048514]
./a.out(__gxx_personality_v0+0x176)[0x804858a]
/lib/libc.so.6(__libc_start_main+0xdc)[0xb877e4]
./a.out(__gxx_personality_v0+0x5d)[0x8048471]
======= Memory map: ========
00b55000-00b6e000 r-xp 00000000 fd:00 6687029 /lib/ld-2.4.so
00b6e000-00b6f000 r-xp 00018000 fd:00 6687029 /lib/ld-2.4.so
00b6f000-00b70000 rwxp 00019000 fd:00 6687029 /lib/ld-2.4.so
00b72000-00c9e000 r-xp 00000000 fd:00 6687030 /lib/libc-2.4.so
00c9e000-00ca1000 r-xp 0012b000 fd:00 6687030 /lib/libc-2.4.so
00ca1000-00ca2000 rwxp 0012e000 fd:00 6687030 /lib/li
*** glibc detected *** ./a.out: free(): invalid pointer: 0x0804878d ***
This means you probably deleted a pointer that wasn't created with new.
If you want any useful help, you really should post the code that generates this problem.
If you look at these two lines of the stack trace, you will see that the page that starts at 0x8048000 must be executable (because two addresses within that page, 0x8048514 and 0x804858a, appear as return addresses on the stack).
./a.out(__gxx_personality_v0+0x100)[0x8048514]
./a.out(__gxx_personality_v0+0x176)[0x804858a]
The address you are trying to free, 0x0804878d, is at offset 0x78d in that same page, so it probably points to code, and it definitely points within a page that is executable.
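For illustration, one classic way to end up freeing an address inside the executable image is deleting or freeing a pointer to a string literal; a minimal sketch (on 32-bit builds like this one, the literal lives in the a.out mapping around 0x8048000):

    #include <cstdlib>

    int main()
    {
        const char *s = "hello"; // stored in the program image, not on the heap
        // glibc aborts with "free(): invalid pointer: 0x...", and the
        // printed address falls inside the executable's own mapping,
        // matching the backtrace and memory map above.
        std::free(const_cast<char *>(s)); // BUG
        return 0;
    }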