Debug the initial startup of a Linux daemon using gdb - C++

I want to debug the very initial startup of a daemon started as a service under Linux (CentOS 7).
My service is started as: "service mydaemon start"
I know about attaching gdb to a running process, but unfortunately that technique is too slow; the initial execution of mydaemon is what matters.
mydaemon is written in C++ and full debug info is available.

"unfortunately that technique is too slow"
There are two general solutions to this problem.
The first one is described here: you make your target executable wait for GDB to attach (this requires building a special version of the daemon).
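A minimal sketch of that first approach, assuming you can add a small block at the top of mydaemon's main(); the WAIT_FOR_GDB variable, the flag, and the helper name are arbitrary illustrative choices, not part of any existing API:
#include <csignal>
#include <cstdlib>
#include <unistd.h>

namespace {
// Clear this from gdb to let startup proceed:  (gdb) set var g_wait_for_gdb = 0
volatile std::sig_atomic_t g_wait_for_gdb = 1;
}

// Call this first thing in main(). It only blocks when WAIT_FOR_GDB is set
// in the daemon's environment, so a normal service start is unaffected.
void maybe_wait_for_gdb()
{
    if (std::getenv("WAIT_FOR_GDB") == nullptr)
        return;
    while (g_wait_for_gdb)
        sleep(1);   // attach with: gdb -p $(pidof mydaemon), set breakpoints, clear the flag
}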
The second is to "wrap" your daemon in gdbserver (as root):
mv mydaemon mydaemon.exe
cat > mydaemon <<'EOF'
#!/bin/sh
exec gdbserver :1234 /path/to/mydaemon.exe "$@"
EOF
chmod +x mydaemon
Now run service mydaemon start; the process will be stopped by gdbserver and will wait for a connection from GDB.
gdb /path/to/mydaemon.exe
(gdb) target remote :1234
# You should now be looking at the mydaemon process stopped in `_start`.
At that point you can set breakpoints and use continue, next, or step as appropriate.
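For instance, a minimal first session might look like this (break main is just an illustrative choice; substitute whichever startup function you actually care about):
(gdb) break main
(gdb) continue
(gdb) next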

Expose kernel for existing environment on Vertex AI User Managed Notebook startup

This is a continuation of this thread, posted here because it was too complicated for a comment.
TL;DR
In a Vertex AI User Managed Notebook, how does one retain the exposed kernel icons for existing venv (and conda, if possible) environments stored on the data disk, through repeated stop and start cycles?
Details
I am using User Managed Notebook Instances built off a Docker image. Once the Notebook is launched, I manually go in and create a custom environment. For the moment, let's say this is a venv Python environment. The environment works fine, and I can expose the kernel so it shows as an icon in the JupyterLab Launcher. If I shut the instance down and restart it, the icon is gone. I have been trying to create a start-up script that re-exposes the kernel, but it is not working properly. I have been trying to use method #2 proposed by @gogasca in the link above. Among other operations (which do execute correctly), my start-up script contains the following:
cat << 'EOF' > /home/jupyter/logs/exposeKernel.sh
#!/bin/bash
set -x
if [ -d /home/jupyter/envs ]; then
    # For each env creation file...
    for i in /home/jupyter/envs/*.sh; do
        tempName="${i##*/}"
        envName=${tempName%.*}
        # If there is a corresponding env directory, then expose the kernel
        if [ -d /home/jupyter/envs/${envName} ]; then
            /home/jupyter/envs/${envName}/bin/python3 -m ipykernel install --prefix=/root/.local --name $envName &>> /home/jupyter/logs/log.txt
            echo -en "Kernel created for: $envName \n" &>> /home/jupyter/logs/log.txt
        else
            echo -en "No kernels can be exposed\n" &>> /home/jupyter/logs/log.txt
        fi
    done
fi
EOF
chown root /home/jupyter/logs/exposeKernel.sh
chmod a+r+w+x /home/jupyter/logs/exposeKernel.sh
su -c '/home/jupyter/logs/exposeKernel.sh' root
echo -en "Existing environment kernels have been exposed\n\n" &>> /home/jupyter/logs/log.txt
I am attempting to log the operations, and I see in the log that the kernel is created successfully in the same location where it would be created if I were to manually activate the environment and expose the kernel from within. Despite the apparent success in the log (no errors, at least), the kernel icon does not appear. If I manually run the exposeKernel.sh script from the terminal using su -c '/home/jupyter/logs/exposeKernel.sh' root, it also works fine and the kernel is exposed correctly. @gogasca's comments on the aforementioned thread suggest that I should be using the jupyter user instead of root, but repeated testing and logging indicate that the jupyter user fails to execute the code while root succeeds (though neither creates the kernel icon when called from the start-up script).
Questions:
(1) My goal is to automatically re-expose the existing environment kernels on startup. Presumably they disappear each time the VM is stopped and started because there is some kind of linking to the boot disk that is rebuilt each time. What is the appropriate strategy here? Is there a way to build the environments (interested in both conda and venv) so that their kernel icons don't vaporize on shut-down?
(2) If the answer to (1) is no, then why does the EOF-created file fail to accomplish the job when called from a start-up script?
(3) Tangentially related, am I correct in thinking that the post-startup-script executes only once during the initial Notebook instance creation process, while the startup-script or startup-script-url executes each time the Notebook is started?

How to make GCP start script to start multiple processes?

So, I'm using Google Cloud Platform and set the startup script below:
#! /bin/bash
cd /home/user
sudo ./process1
sudo ./process2
I was worried about this script because process1 blocks the shell and prevents sudo ./process2 from running. And that is exactly what happened: process1 started successfully, but process2 did not.
I checked that the script itself has no problem starting process1 and process2. Running ./process2 via SSH worked, but after I closed the SSH shell, process2 stopped too.
How can I start both processes at boot time (or even after)?
I tried testing your startup script in my environment, and it seems to work well.
1. Please try checking the process1 and process2 scripts.
2. If you want your process to keep running in the background even after the SSH session is closed, you can add “&” (your_command &) at the end of your command.
To run a command in the background, add an ampersand (&) at the end of the command:
your_command &
The script execution then continues and isn't blocked. Alternatively, use Linux's built-in mechanisms to auto-run processes on boot.
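For instance, a corrected version of the startup script from the question might look like this (a sketch only; the paths and the use of sudo are kept from the question):
#! /bin/bash
cd /home/user
# Launch both processes in the background so the first one does not block the second.
sudo ./process1 &
sudo ./process2 &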

How can I detach a gdb session from outside?

I start a gdb session in the background with a command like this:
gdb --batch --command=/tmp/my_automated_breakpoints.gdb -p pid_of_process &> /tmp/gdb-results.log &
The & at the end lets it run in the background (and the shell is immediately closed afterwards as this command is issued by a single ssh command).
I can find out the pid of the gdb session with ps -aux | grep gdb.
However: How can I gracefully detach this gdb session from the running process just like I would if I had the terminal session in front of me with the (gdb) detach command?
When I kill the gdb session (and not the running process itself) with kill -9 gdb_pid, I get unwanted SIGABRTs afterwards in the running program.
A restart of the service is too time consuming for my purpose.
In case of a successful debugging session with this automated script I could use a detach command inside the batch script. This is however not my case: I want to detach/quit the running gdb session when there are some errors during the session, so I would like to gracefully detach gdb by hand from within another terminal session.
If you run the gdb command from terminal #1 in the background, you can always bring gdb back into the foreground by running the command fg. Then you can simply press CTRL+C and detach as usual to stop the debugging session gracefully.
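For example (assuming the backgrounded gdb is job %1 in that shell and it still accepts input):
fg %1
# press Ctrl+C to interrupt gdb, then at the prompt:
(gdb) detach
(gdb) quit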
Assuming that terminal #1 is now occupied by something else and you cannot use it, you can send a SIGHUP signal to the gdb process to detach it:
sudo kill -s SIGHUP $(pidof gdb)
(Replace the $(pidof gdb) with the actual PID if you have more than one gdb instance)

How to modify the environment variables and working directory of gdbserver --multi without restarting it?

When I run a program that prints the environment from environ locally with:
./gdb myprintenv
I can change environment variables across runs with:
run
set environment asdf=qwer
run
Is there any way to do that with gdbserver --multi?
I'm running it as:
gdbserver --multi :1234 ./myprintenv
and then locally:
arm-linux-gnueabihf-gdb -ex 'target extended-remote remotehost:1234' ./myprintenv
then the command:
set environment asdf=qwer
run
has no effect.
I can change the variables with:
asdf=qwer gdbserver --multi :1234 ./myprintenv
but that is annoying as it requires the mon exit, go to board, rerun, go to host, reconnect dance.
The same goes for working directory, which you can change with cd locally, but not on the server apparently.
One alternative would be to launch gdbserver with SSH every time without --multi, just like Eclipse does, but that has the downside that it is harder to see stdout: How can I reach STDIN/STDOUT through a gdbserver session
This feature doesn't exist in gdb yet. It's being developed though: https://sourceware.org/ml/gdb-patches/2017-08/msg00000.html
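In the meantime, a sketch of the workaround already mentioned above is to relaunch gdbserver each time with the environment and working directory set on the remote side (remotehost, the working directory, and the port are placeholders, and the stdout caveat from the question still applies):
ssh remotehost 'cd /desired/cwd && asdf=qwer gdbserver :1234 ./myprintenv'
# then, back on the host:
arm-linux-gnueabihf-gdb -ex 'target remote remotehost:1234' ./myprintenv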

Attach valgrind with daemon and collect logs for each daemon call

I have a client-server system, completely written in C++. The server runs as /etc/init.d/serverd with start/stop options. The client executes any command as client.exe --options. With each client call, the daemon is hit.
I want to attach valgrind with /etc/init.d/serverd to detect leak.
I tried the options below but failed.
/usr/local/bin/valgrind --log-file=valgrind_1.log -v --trace-children=yes --leak-check=full --tool=memcheck --vgdb=yes --vgdb-error=0 /etc/init.d/serverd start
Each time it fails to attach to the daemon.
What we want is to attach valgrind to the daemon at start time [the exact point is: I will stop the daemon, attach valgrind to it, and then start it again], so that for each execution of client.exe --options, logs are generated for the daemon in --log-file=valgrind_1.log.
Does anyone have any idea about how to do the same?
It does not seem possible to attach valgrind to an existing process:
http://valgrind.org/docs/manual/faq.html#faq.attach
It seems to me the best practice is to kill the daemon process and run the executable under valgrind yourself.
For a systemd-managed daemon, you can change ExecStart= to run valgrind, like the following:
ExecStart={valgrind-command-with-flags} /usr/sbin/foo-daemon
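For instance, a drop-in override (e.g. created with systemctl edit foo-daemon.service; the unit name, binary paths, and flags here are placeholders based on the commands above) might look roughly like this. The empty ExecStart= line clears the unit's original setting before the valgrind-wrapped one is added, and the log path should be somewhere the service user can write:
[Service]
ExecStart=
ExecStart=/usr/local/bin/valgrind --tool=memcheck --leak-check=full --trace-children=yes --log-file=/var/log/valgrind_foo-daemon_%p.log /usr/sbin/foo-daemon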
Do make sure to redirect the output to a well-defined location.
Caution: a daemon running under valgrind can be extremely slow and may not run as expected.