I am new to Valgrind. I need to run Valgrind on a server written in C++. The server listens on a port. When I run the server inside Valgrind, I can't communicate with the server: the port is not listening.
valgrind --tool=memcheck --leak-check=yes --log-file=valgrind_log.txt /binary_path -c
I need the server to listen on the port when I run it under Valgrind.
If you have already confirmed that the exact same binary opens the desired network socket outside Valgrind, but it does not do so under Valgrind, then read on.
Valgrind only works by launching the binary itself and cannot attach to an already running process (as explained here).
Valgrind is also sensitive to changes of effective UID, particularly when running from the root UID; you cannot use sudo with valgrind (detailed here).
You cannot run Valgrind on an executable binary that has Linux capability bits enabled (details here).
Valgrind cannot handle setuid-root binaries on an NFS filesystem (even when mounted to allow this). The workaround is to move your build or binary to a non-NFS partition.
Having said all that, it may simply be a timing problem: Valgrind makes everything SLOWER, so the control flow of your code may be "missing its mark" for performing the open on a network socket. One way to confirm this is to put debug print statements throughout your code and nail down that timing logic.
Alternatively...
To see what a production-grade daemon is doing from the very beginning of startup, execute:
valgrind --trace-children=yes /usr/sbin/<your-server-binary>
There's another way to monitor a network socket in action; read on ...
Tracing from start of execution
You can run strace from the start and find out which network socket gets opened (and, as described later, show its buffer contents) with:
strace -e trace=network <your-server-binary> <server-arguments>
Make a note of the desired fd (file descriptor) number.
As with any strace command that starts a process, pressing Ctrl-C will stop that process. But when using strace on an already running process, Ctrl-C safely detaches from the target (letting it continue running) and returns you to your shell prompt.
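To make "note the fd" concrete, here is a sketch of pulling the fd out of a saved strace log after the fact. The log excerpt below is a hand-written sample of typical strace network-call output, not captured from your server:

```shell
# Hand-made sample of typical strace network-call output (not a real capture)
cat > /tmp/sample_trace.txt <<'EOF'
socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
bind(3, {sa_family=AF_INET, sin_port=htons(8080), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
listen(3, 128) = 0
EOF

# The fd is the return value of the socket() call: the number after "= "
fd=$(awk -F'= ' '/^socket\(/ { print $2; exit }' /tmp/sample_trace.txt)
echo "listening fd: $fd"
```

The same awk one-liner works on a real log produced with `strace -o logfile`.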
Attaching to already running server
You can also monitor an already running production daemon with strace, but it is harder to find the opened fd number for your network socket that way. Run the previous step briefly to get that fd.
Find out your PID using ps auxw.
Then plug in your server/daemon’s PID here:
strace -f -p <your-server-PID> -e trace=network
to find out its fd number.
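As a concrete sketch of the PID-finding step (using sleep as a stand-in for your server binary, since the real name is yours to substitute), you can cross-check what pgrep reports against a PID you already know:

```shell
# Start a stand-in "daemon" (sleep substitutes for your server binary)
sleep 30 &
expected=$!

# pgrep: -n newest matching process, -x exact command-name match
pid=$(pgrep -n -x sleep)
echo "found pid: $pid (expected: $expected)"

kill "$expected"
```

With a real daemon you would use its binary name instead of sleep, or stick with ps auxw and read the PID column directly.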
Exact socket monitoring
With the identified fd in hand, rerun strace to attach to the production server with:
strace -f -e read=<fd> -e write=<fd> -p <your-daemon-PID>
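Once you have such a capture, the buffer contents can be pulled out of the log with a one-liner. The trace lines below are a hand-made sample in strace's read(2)/write(2) format, not a real capture:

```shell
# Hand-made sample in strace's read(2)/write(2) output format (not a real capture)
cat > /tmp/io_trace.txt <<'EOF'
read(3, "GET / HTTP/1.1\r\nHost: localhost\r\n\r\n", 4096) = 35
write(3, "HTTP/1.1 200 OK\r\n\r\n", 19) = 19
EOF

# Print whatever sits between the quotes: the buffer content of each call
payloads=$(sed -n 's/^[a-z]*([0-9]*, "\(.*\)",.*/\1/p' /tmp/io_trace.txt)
printf '%s\n' "$payloads"
```

Note that strace prints escape sequences like \r\n literally, so the extracted text is the escaped form, not raw bytes.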
Network troubleshooting checklist
lsof -i -n: list open network sockets
strace: trace the system calls a process makes
netstat -lt: list listening TCP sockets
tcpdump/wireshark: capture and inspect packets on the wire
A list of network troubleshooting tools for Linux is given here, here and most comprehensively here.
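As a first sanity check when a port "is not listening", it helps to list what actually is listening. A minimal sketch, assuming port 8080 as a placeholder for your server's port (ss ships with iproute2; netstat -ltn is the older net-tools equivalent):

```shell
# List listening TCP sockets, numeric ports (-l listening, -t TCP, -n numeric)
listeners=$(ss -ltn 2>/dev/null || netstat -ltn 2>/dev/null)
printf '%s\n' "$listeners"

# Then look for your server's port (8080 is a placeholder; use yours)
if printf '%s\n' "$listeners" | grep -q ':8080 '; then
  echo "port 8080 is listening"
else
  echo "port 8080 is not listening"
fi
```

Run this once with the server outside Valgrind and once under it; if the port only shows up in the first case, the problem really is in how Valgrind runs your binary rather than in your client.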
Related
I am a total newbie trying to debug an application that should receive a binary via TCP. When I launch it, it begins listening on several ports; I execute cat myfile | nc port_number in another terminal (the app's terminal is busy) and the app runs. But I have no idea how to debug it in Qt Creator. Everything I found on the Internet seems irrelevant to my problem.
I can find instructions online to break on accesses to memory addresses using gdb (Watch a memory range in gdb?) but I can't figure out how to do so for memory addresses on the guest machine when I use qemu.
You start qemu with a gdb server listening on port 1234 by supplying -s on the qemu command line. From the qemu man page:
-s  Shorthand for -gdb tcp::1234, i.e. open a gdbserver on TCP port 1234.
In addition to this, you can also use the -S option, which stops qemu from progressing until you connect gdb to it and issue the continue command.
-S Do not start CPU at startup (you must type 'c' in the monitor).
From gdb, you connect to the gdb server running in qemu by starting gdb (a build of gdb that matches your guest architecture), then connecting to the gdb server with the following command (assuming qemu is running on the same machine):
(gdb) target remote :1234
References:
http://wiki.qemu.org/Documentation/Debugging
How to debug the Linux kernel with GDB and QEMU?
I used St-write to burn the .bin to the STM32F4 and saw the message I expected. Now I hope to understand how the GPIO is initialized, so I am using OpenOCD and arm-none-eabi-gdb to do that. Here is my process:
$ minicom
$ openocd -f /opt/openocd/share/openocd/scripts/board/stm32f4discovery.cfg
$ arm-none-eabi-gdb main.elf
(gdb) target remote localhost:3333
(gdb) localhost:3333: Connection timed out.
How do I check OpenOCD's port? Why does the connection time out?
That certainly means that openocd did not start or that the port is busy.
Usually, you use :
openocd -f board/stm32f4discovery.cfg
You should check that your session is running.
Are you running a virtual linux machine on a windows host?
If so, you probably need to replace localhost with 10.0.0.2 (or whatever your windows IP is).
A good way to know is to telnet to the openOCD address on port 4444 and see if you get the openOCD prompt and can type a few commands.
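A quick way to script that check uses bash's /dev/tcp redirection (a bash-only feature): probe OpenOCD's default ports, 3333 for the gdb server and 4444 for the telnet console. A sketch:

```shell
# Probe OpenOCD's default ports: 3333 (gdb server) and 4444 (telnet console).
# Uses bash's /dev/tcp; on a non-bash shell the open simply fails -> "closed".
status=""
for port in 3333 4444; do
  if (exec 3<>"/dev/tcp/localhost/$port") 2>/dev/null; then
    status="$status $port:open"
  else
    status="$status $port:closed"
  fi
done
echo "openocd ports:$status"
```

If both ports report closed, OpenOCD is not running (or is bound elsewhere, e.g. inside a VM, in which case substitute the host's IP for localhost as described above).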
I am trying to debug a device driver which is crashing the kernel on a Mac using a remote machine running gdb (trying to follow the instructions here). Both machines are connected to the same network by Ethernet (same router even, and both can access the network). I have also set nvram boot-args="debug=0x144" on the target and restarted.
I then load the kernel extension on the target as usual. On the host machine I start gdb like this:
$ gdb -arch i386 /Volumes/KernelDebugKit/mach_kernel
Once in gdb, I load the kernel macros and set up for remote attachment
(gdb) source /Volumes/KernelDebugKit/kgmacros
(gdb) target remote-kdp
(gdb) kdp-reattach 11.22.33.44
However, the last command then does not make a connection and I get an endless spool of
kdp_reply_wait: error from kdp_receive: receive timeout exceeded
kdp_transaction (remote_connect): transaction timed out
kdp_transaction (remote_connect): re-sending transaction
What is the correct way to get gdb connected to the target machine?
There are a number of ways to break into the target, including:
Kernel panic, as stated in your answer above.
Non-maskable interrupt, which is triggered by the cmd-option-ctrl-shift-esc key combination.
Code a break in your kernel extension using PE_enter_debugger(), which is declared in pexpert/pexpert.h
Halt at boot by setting DB_HALT (0x01) in the NVRAM boot-args value.
Additionally, you may need to set a persistent ARP table entry, as the target is unable to respond to ARP requests while stopped in the debugger. I use the following in my debugger-launch shell script to set the ARP entry if it doesn't already exist:
if ! arp -a -n -i en0 | grep -q '10\.211\.55\.10[)] at 0:1c:42:d7:29:47 on en0 permanent'; then
    echo "Adding arp entry"
    sudo arp -s 10.211.55.10 00:1c:42:d7:29:47
fi
Someone more expert could probably improve on my bit of shell script.
All of the above is documented in http://developer.apple.com/library/mac/documentation/Darwin/Conceptual/KernelProgramming/KernelProgramming.pdf.
The answer is simply to make sure the target has a kernel panic before you try to attach gdb from the host.
I have a C++ application that uses ssh to summon a connection to the server. I find that sometimes the ssh session is left lying around long after the command to summon the server has exited. Looking at the Centos4 man page for ssh I see the following:
The session terminates when the command or shell on the remote machine
exits and all X11 and TCP/IP connections have been closed. The exit
status of the remote program is returned as the exit status of ssh.
I see that the command has exited, so I imagine not all of the X11 and TCP/IP connections have been closed. How can I figure out which of these ssh is waiting for, so that I can fix my C++ application's summon command to clean up whatever is being left behind that keeps the ssh open?
I wonder why this failure only occurs some of the time and not on every invocation? It seems to occur approximately 50% of the time. What could my C++ application be leaving around to trigger this?
More background: the server is a daemon; when launched, it forks and the parent exits, leaving the child running. The client summons it using:
popen("ssh -n -o ConnectTimeout=300 user@host \"serverApp argsHere\""
      " 2>&1 < /dev/null", "r")
Use libssh or libssh2 rather than calling popen(3) from C only to invoke ssh(1), which is itself another C program. If you want my personal experience, I'd say try libssh2: I've used it in a C++ program and it works.
I find some hints here:
http://www.snailbook.com/faq/background-jobs.auto.html
This problem is usually due to a feature of the OpenSSH server. When writing an SSH server, you have to answer the question, "When should the server close the SSH connection?" The obvious answer might seem to be: close it when the server-side user program started by client request (shell or remote command) exits. However, it's actually a bit more complicated; this simple strategy allows a race condition which can cause data loss (see the explanation below). To avoid this problem, sshd instead waits until it encounters end-of-file (eof) on the pipes connecting to the stdout and stderr of the user program.
#sienkiew: If you really want to execute a command or script via ssh and exit, have a look at the daemon tool of the libslack package. (Similar tools that can detach a command from its standard streams would be screen, tmux or detach.)
To inspect stdin, stdout & stderr of the command executed via ssh on the command line, you can, for example, use lsof.
# sample code to figure out why ssh session does not exit
# sleep keeps its stdout open, so sshd only sees EOF after command completion
ssh localhost 'sleep 10 &' # blocks
ssh localhost 'sleep 10 1>&- &' # does not block
ssh localhost 'sleep 10 & lsof -p ${!}'
ssh localhost 'sleep 10 1>&- & lsof -p ${!}'
ssh localhost 'sleep 10 1>/dev/null & lsof -p ${!}'
ssh localhost 'sleep 10 1>/dev/null 2>&1 & lsof -p ${!}'
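The same EOF behavior can be demonstrated without ssh at all: a pipe reader (standing in for sshd) keeps waiting until every process holding the pipe's write end has closed it. A sketch, with sleep playing the background job and cat playing sshd:

```shell
# cat stands in for sshd: it reads until EOF on the pipe.
start=$(date +%s)
( sleep 3 & echo started ) | cat          # blocks ~3s: sleep inherits the pipe's write end
t_open=$(( $(date +%s) - start ))

start=$(date +%s)
( sleep 3 1>&- & echo started ) | cat     # returns at once: sleep's stdout is closed
t_closed=$(( $(date +%s) - start ))

echo "write end open: ${t_open}s, write end closed: ${t_closed}s"
```

This is exactly why redirecting the daemon's stdout and stderr (as in the ssh examples above) lets the session terminate promptly.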