I've been searching for a way on Linux to find out which thread first acquired a fd, but no luck so far.
/proc/pid/task/ shows the fd as available to each thread, which makes sense since descriptors are shared throughout the whole process. lsof is of course not much help either for this use case.
The program is very complex, and strace or gdb won't help either; there are tons of closed-source libraries in use. The file path is known, but that does not help since I don't have access to the code in the libraries. I suspect the fd leak is due to some race condition that occurs very rarely, and I need to trace the thread that opened the file.
One solution that would be easy to implement would be to add a log in the kernel's file open handler or in the C library, but for good reasons I'm able to alter neither the kernel nor the standard library.
Any suggestions?
If you have kernel symbols available use SystemTap:
sudo stap -e 'probe syscall.open.return { \
printf("tid=%d, fd=%d\n", tid(), $return) }'
Related
I am working on a big code base. It is heavily multithreaded.
After running the Linux-based application for a few hours, at the very end, right before reporting, the application goes silent. It doesn't die, it doesn't crash, it just waits there. Joins, mutexes, condition variables ... any of these can be the culprit.
If it had crashed, I would at least have a chance to find the source using debugger. But this way, I have no clue how to use what tool to find the bug. I can't even post a code sample for you. The only thing that can possibly help is to tap MANY places with cout to get a visual where the application is.
Have you been in such a situation? What do you recommend?
If you're running under Linux, just use gdb to run the program. When the application 'silences', interrupt it with CTRL+C, then type backtrace to see the call stack of the current thread (thread apply all backtrace prints it for every thread, which matters in a multithreaded program). With this you will find the function where your application was blocked.
On Linux, gdb will be a great help. Another tool that can help a lot is strace. (It can also be used when the source of a program is not readily available, because strace does not need recompilation to trace it.)
strace intercepts and records the system calls made by a process and the signals it receives. It shows the order of events and the return/resumption paths of the calls. This can take you much closer to the problem area.
iotop, LTTng and Ftrace are a few other tools that can be helpful in this scenario.
I know that in a DOS/Windows application, you can issue system commands from code using lines like:
system("pause");
or
system("myProgram.exe");
...from stdlib.h. Is there a similar Linux command, and if so which header file would I find it in?
Also, is this considered bad programming practice? I am considering getting a list of loaded kernel modules using the lsmod command. Is that a good idea or a bad idea? I found some websites that seemed to view system calls (at least system("pause");) in a negative light.
system is a bad idea for several reasons:
Your program is suspended until the command finishes.
It runs the command through a shell, which means you have to worry about making sure the string you pass is safe for the shell to evaluate.
If you try to run a backgrounded command with &, it ends up being a grandchild process and gets orphaned and taken in by the init process (pid 1), and you have no way of checking its status after that.
There's no way to read the command's output back into your program.
For the first and final issues, popen is one solution, but it doesn't address the other issues. You should really use fork and exec (or posix_spawn) yourself for running any external command/program.
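To make the fork/exec route concrete, here is a minimal sketch. It is written in Python only for brevity; os.fork, os.pipe, os.dup2, os.execvp and os.waitpid are thin wrappers over the C calls of the same names. The helper name run_command is made up for this example. It runs a command without a shell, captures its stdout through a pipe, and reaps the child so its exit status is available.

```python
import os
import sys


def run_command(argv):
    """Run argv without a shell; return (exit_code, captured_stdout_bytes)."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: route stdout into the pipe, then replace the process image.
        try:
            os.close(read_end)
            os.dup2(write_end, 1)
            os.close(write_end)
            os.execvp(argv[0], argv)
        finally:
            # Reached only if exec failed; never fall back into parent code.
            os._exit(127)
    # Parent: read until the child closes its end of the pipe, then reap it.
    os.close(write_end)
    chunks = []
    while True:
        chunk = os.read(read_end, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_end)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status), b"".join(chunks)
    # Killed by a signal: report it as a negative number.
    return -os.WTERMSIG(status), b"".join(chunks)


if __name__ == "__main__":
    code, output = run_command([sys.executable, "-c", "print('hi')"])
    print(code, output)
```

Because the arguments are passed as a list, nothing is ever interpreted by a shell, which takes care of the quoting concern above. The parent here still blocks until the child finishes; to avoid that you would keep the pid and call waitpid later (or with os.WNOHANG).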
Not surprisingly, the command is still
system("whatever");
and the header is still stdlib.h. That header file's name means "standard library", which means it's on every standard platform that supports C.
And yes, calling system() is often a bad idea. There are usually more programmatic ways of doing things.
If you want to see how lsmod works, you can always look up its source code and see which major system calls it makes. Then use those calls yourself.
A quick Google search turns up this link, which indicates that lsmod is reading the contents of /proc/modules.
Well, lsmod does it by parsing the /proc/modules file. That would be my preferred method.
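As a rough illustration of the /proc/modules approach, here is a small Python sketch. The function names are made up for this example. It assumes the usual line layout of that file (name, size, reference count, then a comma-separated list of dependent modules or "-"), which is what lsmod prints as Module, Size and Used by.

```python
def parse_modules(text):
    """Return (name, size_bytes, used_by) tuples from /proc/modules content."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        name, size, _refcount, deps = fields[:4]
        # The dependents list is "-" when empty, otherwise "a,b," with a
        # trailing comma.
        used_by = [] if deps == "-" else [d for d in deps.split(",") if d]
        modules.append((name, int(size), used_by))
    return modules


def loaded_modules(path="/proc/modules"):
    """Read and parse the live module list (Linux only)."""
    with open(path) as handle:
        return parse_modules(handle.read())
```

In C the same thing is an fopen on /proc/modules followed by fgets and sscanf per line, with no child process involved.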
I think what you are looking for are fork and exec.
Is there any way to get all open sockets using C++? I know the lsof command, and it does what I'm looking for, but how do I use it in a C++ application?
The idea is to get the FD of an opened socket by its port number and the pid.
Just open the files in /proc/net, like /proc/net/tcp, /proc/net/udp, etc. No need to slog through the lsof sources. :)
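For what it's worth, a sketch of what reading /proc/net/tcp involves, in Python to keep it short (function names are made up for the example). It assumes the common IPv4 layout: after a header line, field 1 is the local address, field 2 the remote address, field 3 the state and field 9 the socket inode, all but the inode in hex. It also assumes a little-endian host (x86, most ARM), since the address is printed in host byte order.

```python
import socket
import struct


def decode_ipv4_endpoint(hex_endpoint):
    """Turn '0100007F:0CEA' into ('127.0.0.1', 3306)."""
    hex_ip, hex_port = hex_endpoint.split(":")
    # Assumption: little-endian host, so the hex word is byte-reversed.
    ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
    return ip, int(hex_port, 16)


def parse_proc_net_tcp(text):
    """Parse the content of /proc/net/tcp into a list of dicts."""
    entries = []
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 10:
            continue
        entries.append({
            "local": decode_ipv4_endpoint(fields[1]),
            "remote": decode_ipv4_endpoint(fields[2]),
            "state": int(fields[3], 16),  # 0x0A (10) is LISTEN
            "inode": int(fields[9]),
        })
    return entries
```

/proc/net/tcp6 uses 128-bit addresses and needs a different decoder, so this sketch covers IPv4 only.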
If you don't want to copy/paste or reimplement chunks of the lsof code, and it doesn't build any useful libraries you could leverage, you can still open a pipe to an lsof process and peruse its output.
Check the lsof source?
ftp://lsof.itap.purdue.edu/pub/tools/unix/lsof/
The lsof command is prepared specifically such that it can be used from other programs including C, see the section: OUTPUT FOR OTHER PROGRAMS of man lsof for more information. For example you can invoke lsof with -F p and it will output the pid of the processes prefixed with 'p':
$ lsof -F p /some/file
p1234
p4321
You can then use popen to execute this command in a child process and read from its standard output.
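A sketch of consuming that field output, in Python (subprocess.run plays the role of popen here; the function names are made up for the example). The parser only looks at lines starting with 'p' and skips everything else, since lsof may emit other field lines alongside the pid.

```python
import subprocess


def parse_lsof_pids(output):
    """Extract pids from `lsof -F p` style output, ignoring other fields."""
    pids = []
    for line in output.splitlines():
        if line.startswith("p") and line[1:].isdigit():
            pids.append(int(line[1:]))
    return pids


def pids_using(path):
    """Run lsof on one path and return the pids that have it open."""
    # lsof exits non-zero when nothing has the file open, so the return
    # code is deliberately not checked here.
    result = subprocess.run(
        ["lsof", "-F", "p", path], capture_output=True, text=True
    )
    return parse_lsof_pids(result.stdout)
```

pids_using("/some/file") would return [1234, 4321] for the output shown above, assuming lsof is installed and on the PATH.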
In C (or C++) you can use the NETLINK_SOCK_DIAG interface. This is a socket that gives you access to all that information. You can filter on various parameters such as the interface the socket is attached to.
The documentation is sparse, however. I wrote a low-level test to see whether I could get events whenever someone opened a new socket, but that didn't work. That being said, to list existing sockets, you can use the following code as a good starting point:
Can the NETLINK/SOCK_DIAG interface be used to listen for `listen()` and `close()` events of said socket?
I think that this is cleaner than parsing the /proc/net/tcp and other similar files. You get the same information, but it comes to you in binary.
It may also be simpler to use the libmnl library, which is a layer over the socket. It performs many additional checks on all the calls. However, just like the base NETLINK interface, it's not very well documented (e.g. whether binding is important, which flags you can use with TCP or UDP, etc.). The good thing is that you can always read the source to better understand what's going on.
I have an application (the source for which I don't have), which can be invoked from command line like this
$ ./notmyapp
I want to know all the locations where the application is writing to. It outputs some files in the directory it is being called from, but I need to make sure that those are the only files that are created.
So, I need to isolate the application to find out which files it created or edited while it was running.
How can I do this?
Some way using Perl or C or C++? Do any of the standard libraries in these languages have ways to do this?
strace, ktrace/kdump, truss, dtruss, or whatever other program your platform provides for tracing system calls is probably what you're looking for.
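Expect lots of output from any of those. To figure out which files the application is reading and writing, you might want to limit the output to just a few syscalls: strace -eopen ./notmyapp, for example. (On recent systems the C library usually issues openat rather than open, so strace -e trace=open,openat ./notmyapp is the safer filter.)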
The application might also fork off child processes to do some of its work. With most system call tracers, you'll have to be specific about tracing those child processes as well. With strace, that'd be strace -f ./notmyapp.
On Unix systems, you can use strace to print out a trace of all the system calls made and signals received by a process:
$ strace ./notmyapp
grep can be used to limit the output to a subset of system calls:
$ strace ./notmyapp 2>&1 | egrep '(open|write)'
You could use strace.
You say in response to rafl's answer that notmyapp is supposed to produce a prompt and wait for [...] inputs before doing something.
Put your inputs in advance into a plain text file (say, responses.txt), one input per line. Then use strace, as suggested, to track calls to open() or write(), redirecting the contents of responses.txt to the program's standard input:
$ strace -e trace=open,write ./notmyapp < responses.txt
If you're expecting a lot of file access, you may want to pipe the output to your favourite pager or editor. strace writes its trace to stderr, so redirect that into the pipe as well:
$ strace -e trace=open,write ./notmyapp < responses.txt 2>&1 | vim -R -
strace is a powerful tool. For more information, consult man strace.
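If the trace gets long, a small script can boil it down to the list of files the program actually opened for writing. The following Python sketch assumes strace's default line format for open/openat (path in double quotes, symbolic flags, then "= result"); the function name is made up, and paths containing escaped quotes are not handled.

```python
import re

# Matches e.g.  openat(AT_FDCWD, "/tmp/x", O_WRONLY|O_CREAT, 0666) = 4
OPEN_CALL = re.compile(
    r'\bopen(?:at)?\((?:[^,"]+, )?"(?P<path>[^"]*)", (?P<flags>[A-Z0-9_|]+)'
    r'[^)]*\)\s*=\s*(?P<result>-?\d+)'
)


def files_opened_for_writing(trace_text):
    """Return paths successfully opened with O_WRONLY or O_RDWR, in order."""
    paths = []
    for line in trace_text.splitlines():
        match = OPEN_CALL.search(line)
        if not match or int(match.group("result")) < 0:
            continue  # not an open call, or the open failed
        flags = match.group("flags").split("|")
        writable = "O_WRONLY" in flags or "O_RDWR" in flags
        if writable and match.group("path") not in paths:
            paths.append(match.group("path"))
    return paths
```

One way to feed it is to save the trace with strace -f -o trace.log -e trace=open,openat ./notmyapp and pass the file's content to the function. Relative paths are reported as the program passed them, so they are relative to its working directory at that moment.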
You could try running it as a user which has no rights to write anywhere on any drive. Then you get an error message when it tries to create/write the first file. Log that directory/file and give write rights to it, then repeat until there are no more error messages.
I need to get a list of all opened ports on my machine and what application opened them.
I need to get this information programmatically.
Thanks.
You have to implement the following (shown for a process with a single socket; <pid> is a placeholder):
socket=$(ls -l /proc/<pid>/fd | grep socket | sed 's/.*socket:\[//' | sed 's/\]//')
grep $socket /proc/net/tcp
Parse the output of the previous command (the second field, local_address, holds the port in hexadecimal after the colon)
I was hoping a cleverer answer would appear. I did just this (programmatically in Python), in an attempt to rewrite a program called NetHogs. My version is here, specifically here is the module in Python used to parse the table from /proc. If you're not Python literate (go learn it), then take a look at the original NetHogs, which uses a blend of C/C++ (and is a bit painful to read hence the rewrite).
It's worth noting that extensive or rapidly repeated parsing of socket information from /proc is very CPU intensive, as the operating system has to handle every syscall made and generate the text from its internal structures on the fly. As such, you'll find some caching and timing assumptions in the source of both projects I've linked.
The short of it is, you need to relate the socket inodes given for each process in /proc/<pid>/fd to the connections given in /proc/net/<proto>. Again, example parsing, and how to locate all of these are present in both projects.
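The matching step itself is small. Here is a Python sketch of it (not taken from either project; all names are made up for the example). It assumes the fd symlinks of a socket read as socket:[inode], and that you already have an inode-to-port mapping built from /proc/net/<proto>.

```python
import os
import re

SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


def socket_inode(link_target):
    """Return the inode from a 'socket:[12345]' link target, else None."""
    match = SOCKET_LINK.match(link_target)
    return int(match.group(1)) if match else None


def socket_fds(pid):
    """Map fd number -> socket inode for one process via /proc/<pid>/fd."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # fd was closed between listdir and readlink
        inode = socket_inode(target)
        if inode is not None:
            result[int(name)] = inode
    return result


def fds_for_port(fd_to_inode, inode_to_port, port):
    """Return the fds whose socket inode is bound to the given local port."""
    return sorted(
        fd for fd, inode in fd_to_inode.items()
        if inode_to_port.get(inode) == port
    )
```

Calling fds_for_port(socket_fds(pid), inode_to_port, 8080) then gives the descriptors in that process that belong to port 8080. Reading another user's /proc/<pid>/fd needs the appropriate privileges.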
exec('netstat');
The information about open files (including sockets) can be dug out of the /proc directory.
This article gives a lot of detail and gets you started.
ss -nltup
netstat -ltupn
lsof -iTCP -sTCP:LISTEN
Edit: Ah sorry, not programmatic. But helpful if you want to fork a process.
No point in re-inventing the wheel every time.