I'm new to kernel development and I would like to know how to run/debug the linux kernel using QEMU and gdb. I'm actually reading Robert Love's book but unfortunately it doesn't help the reader on how to install proper tools to run or debug the kernel... So what I did was to follow this tutorial http://opensourceforu.efytimes.com/2011/02/kernel-development-debugging-using-eclipse/. I'm using eclipse as an IDE to develop on the kernel but I wanted first to get it work under QEMU/gdb. So what I did so far was:
1) To compile the kernel with:
make defconfig (then setting the CONFIG_DEBUG_INFO=y in the .config)
make -j4
2) Once the compilation is over I run Qemu using:
qemu-system-x86_64 -s -S /dev/zero -kernel /arch/x86/boot/bzImage
which launch the kernel in "stopped" state
3) Thus I have to use gdb, I try the following command:
gdb ./vmlinux
which run it correctly but... Now I don't know what to do... I know that I have to use remote debugging on the port 1234 (default port used by Qemu), using the vmlinux as the symbol table file for debugging.
So my question is: What should I do to run the kernel on Qemu, attach my debugger to it and thus, get them work together to make my life easier with kernel development.
I'd try:
(gdb) target remote localhost:1234
(gdb) continue
Using the '-s' option makes qemu listen on port tcp::1234, which you can connect to as localhost:1234 if you are on the same machine. Qemu's '-S' option makes Qemu stop execution until you give the continue command.
Best thing would probably be to have a look at a decent GDB tutorial to get along with what you are doing. This one looks quite nice.
Step-by-step procedure tested on Ubuntu 16.10 host
To get started from scratch quickly I've made a minimal fully automated QEMU + Buildroot example at: https://github.com/cirosantilli/linux-kernel-module-cheat/blob/c7bbc6029af7f4fab0a23a380d1607df0b2a3701/gdb-step-debugging.md Major steps are covered below.
First get a root filesystem rootfs.cpio.gz. If you need one, consider:
a minimal init-only executable image: https://unix.stackexchange.com/questions/122717/custom-linux-distro-that-runs-just-one-program-nothing-else/238579#238579
a Busybox interactive system: https://unix.stackexchange.com/questions/2692/what-is-the-smallest-possible-linux-implementation/203902#203902
Then on the Linux kernel:
git checkout v4.15
make mrproper
make x86_64_defconfig
cat <<EOF >.config-fragment
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_KERNEL=y
CONFIG_GDB_SCRIPTS=y
EOF
./scripts/kconfig/merge_config.sh .config .config-fragment
make -j"$(nproc)"
qemu-system-x86_64 -kernel arch/x86/boot/bzImage \
-initrd rootfs.cpio.gz -S -s \
-append nokaslr
On another terminal, from inside the Linux kernel tree, supposing you want to start debugging from start_kernel:
gdb \
-ex "add-auto-load-safe-path $(pwd)" \
-ex "file vmlinux" \
-ex 'set arch i386:x86-64:intel' \
-ex 'target remote localhost:1234' \
-ex 'break start_kernel' \
-ex 'continue' \
-ex 'disconnect' \
-ex 'set arch i386:x86-64' \
-ex 'target remote localhost:1234'
and we are done!!
For kernel modules see: How to debug Linux kernel modules with QEMU?
For Ubuntu 14.04, GDB 7.7.1, hbreak was needed, break software breakpoints were ignored. Not the case anymore in 16.10. See also: https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/901944
The messy disconnect and what come after it are to work around the error:
Remote 'g' packet reply is too long: 000000000000000017d11000008ef4810120008000000000fdfb8b07000000000d352828000000004040010000000000903fe081ffffffff883fe081ffffffff00000000000e0000ffffffffffe0ffffffffffff07ffffffffffffffff9fffff17d11000008ef4810000000000800000fffffffff8ffffffffff0000ffffffff2ddbf481ffffffff4600000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000007f0300000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000801f0000
Related threads:
https://sourceware.org/bugzilla/show_bug.cgi?id=13984 might be a GDB bug
Remote 'g' packet reply is too long
http://wiki.osdev.org/QEMU_and_GDB_in_long_mode osdev.org is as usual an awesome source for these problems
https://lists.nongnu.org/archive/html/qemu-discuss/2014-10/msg00069.html
nokaslr: https://unix.stackexchange.com/questions/397939/turning-off-kaslr-to-debug-linux-kernel-using-qemu-and-gdb/421287#421287
Known limitations:
the Linux kernel does not support (and does not even compile without patches) with -O0: How to de-optimize the Linux kernel to and compile it with -O0?
GDB 7.11 will blow your memory on some types of tab completion, even after the max-completions fix: Tab completion interrupt for large binaries Likely some corner case which was not covered in that patch. So an ulimit -Sv 500000 is a wise action before debugging. Blew up specifically when I tab completed file<tab> for the filename argument of sys_execve as in: https://stackoverflow.com/a/42290593/895245
See also:
https://github.com/torvalds/linux/blob/v4.9/Documentation/dev-tools/gdb-kernel-debugging.rst official Linux kernel "documentation"
Linux kernel live debugging, how it's done and what tools are used?
When you try to start vmlinux exe using gdb, then first thing on gdb is to issue cmds:
(gdb) target remote localhost:1234
(gdb) break start_kernel
(continue)
This will break the kernel at start_kernel.
BjoernID's answer did not really work for me. After the first continuation, no breakpoint is reached and on interrupt, I would see lines such as:
0x0000000000000000 in ?? ()
(gdb) break rapl_pmu_init
Breakpoint 1 at 0xffffffff816631e7
(gdb) c
Continuing.
^CRemote 'g' packet reply is too long: 08793000000000002988d582000000002019[..]
I guess this has something to do with different CPU modes (real mode in BIOS vs. long mode when Linux has booted). Anyway, the solution is to run QEMU first without waiting (i.e. without -S):
qemu-system-x86_64 -enable-kvm -kernel arch/x86/boot/bzImage -cpu SandyBridge -s
In my case, I needed to break at something during boot, so after some deciseconds, I ran the gdb command. If you have more time (e.g. you need to debug a module that is loaded manually), then the timing doesn't really matter.
gdb allows you to specify commands that should be run when started. This makes automation a bit easier. To connect to QEMU (which should now already be started), break on a function and continue execution, use:
gdb -ex 'target remote localhost:1234' -ex 'break rapl_pmu_init' -ex c ./vmlinux
As for me the best solution for debugging the kernel - is to use gdb from Eclipse environment. You should just set appropriate port for gdb (must be the same with one you specified in qemu launch string) in remote debugging section. Here is the manual:
http://www.sw-at.com/blog/2011/02/11/linux-kernel-development-and-debugging-using-eclipse-cdt/
On Linux systems, vmlinux is a statically linked executable file that contains
the Linux kernel in one of the object file formats supported by Linux, which
includes ELF, COFF and a.out. The vmlinux file might be required for kernel
debugging, symbol table generation or other operations, but must be made
bootable before being used as an operating system kernel by adding a multiboot
header, bootsector and setup routines.
An image of this initial root file system must be stored somewhere accessible
by the Linux bootloader to the boot firmware of the computer. This can be the
root file system itself, a boot image on an optical disc, a small partition on
a local disk (a boot paratition, usually using ext4 or FAT file systems), or a
TFTP server (on systems that can boot from Ethernet).
Compile linux kernel
Build the kernel with this series applied, enabling CONFIG_DEBUG_INFO (but leave CONFIG_DEBUG_INFO_REDUCED off)
https://www.kernel.org/doc/html/latest/admin-guide/README.html
https://wiki.archlinux.org/index.php/Kernel/Traditional_compilation
https://lwn.net/Articles/533552/
Install GDB and Qemu
sudo pacman -S gdb qemu
Create initramfs
#!/bin/bash
# Os : Arch Linux
# Kernel : 5.0.3
INIT_DIR=$(pwd)
BBOX_URL="https://busybox.net/downloads/busybox-1.30.1.tar.bz2"
BBOX_FILENAME=$(basename ${BBOX_URL})
BBOX_DIRNAME=$(basename ${BBOX_FILENAME} ".tar.bz2")
RAM_FILENAME="${INIT_DIR}/initramfs.cpio.gz"
function download_busybox {
wget -c ${BBOX_URL} 2>/dev/null
}
function compile_busybox {
tar xvf ${BBOX_FILENAME} && cd "${INIT_DIR}/${BBOX_DIRNAME}/"
echo "[*] Settings > Build options > Build static binary (no shared libs)"
echo "[!] Please enter to continue"
read tmpvar
make menuconfig && make -j2 && make install
}
function config_busybox {
cd "${INIT_DIR}/${BBOX_DIRNAME}/"
rm -rf initramfs/ && cp -rf _install/ initramfs/
rm -f initramfs/linuxrc
mkdir -p initramfs/{dev,proc,sys}
sudo cp -a /dev/{null,console,tty,tty1,tty2,tty3,tty4} initramfs/dev/
cat > "${INIT_DIR}/${BBOX_DIRNAME}/initramfs/init" << EOF
#!/bin/busybox sh
mount -t proc none /proc
mount -t sysfs none /sys
exec /sbin/init
EOF
chmod a+x initramfs/init
cd "${INIT_DIR}/${BBOX_DIRNAME}/initramfs/"
find . -print0 | cpio --null -ov --format=newc | gzip -9 > "${RAM_FILENAME}"
echo "[*] output: ${RAM_FILENAME}"
}
download_busybox
compile_busybox
config_busybox
Boot Linux Kernel With Qemu
#!/bin/bash
KER_FILENAME="/home/debug/Projects/kernelbuild/linux-5.0.3/arch/x86/boot/bzImage"
RAM_FILENAME="/home/debug/Projects/kerneldebug/initramfs.cpio.gz"
qemu-system-x86_64 -s -kernel "${KER_FILENAME}" -initrd "${RAM_FILENAME}" -nographic -append "console=ttyS0"
$ ./qemuboot_vmlinux.sh
SeaBIOS (version 1.12.0-20181126_142135-anatol)
iPXE (http://ipxe.org) 00:03.0 C980 PCI2.10 PnP PMM+07F92120+07EF2120 C980
Booting from ROM...
Probing EDD (edd=off to disable)... o
[ 0.019814] Spectre V2 : Spectre mitigation: LFENCE not serializing, switching to generic retpoline
can't run '/etc/init.d/rcS': No such file or directory
Please press Enter to activate this console.
/ # uname -a
Linux archlinux 5.0.3 #2 SMP PREEMPT Mon Mar 25 10:27:13 CST 2019 x86_64 GNU/Linux
/ #
Debug Linux Kernel With GDB
~/Projects/kernelbuild/linux-5.0.3 ➭ gdb vmlinux
...
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
0xffffffff89a4b852 in ?? ()
(gdb) break start_kernel
Breakpoint 1 at 0xffffffff826ccc08
(gdb)
Display all 190 possibilities? (y or n)
(gdb) info functions
All defined functions:
Non-debugging symbols:
0xffffffff81000000 _stext
0xffffffff81000000 _text
0xffffffff81000000 startup_64
0xffffffff81000030 secondary_startup_64
0xffffffff810000e0 verify_cpu
0xffffffff810001e0 start_cpu0
0xffffffff810001f0 __startup_64
0xffffffff81000410 pvh_start_xen
0xffffffff81001000 hypercall_page
0xffffffff81001000 xen_hypercall_set_trap_table
0xffffffff81001020 xen_hypercall_mmu_update
0xffffffff81001040 xen_hypercall_set_gdt
0xffffffff81001060 xen_hypercall_stack_switch
0xffffffff81001080 xen_hypercall_set_callbacks
0xffffffff810010a0 xen_hypercall_fpu_taskswitch
0xffffffff810010c0 xen_hypercall_sched_op_compat
0xffffffff810010e0 xen_hypercall_platform_op
Related
I'm making a college assignement where I need to add some modules to a simple SO (EPOS) and to do that I need to be able to compile it using a cross-compiler and also debug it using gdb.
I'm having success on compiling it, and also on running it o QEMU, but when it comes to debugging it, this error shows up:
qemu-system-riscv32 -machine virt -cpu rv32 -smp 1 -gdb tcp::1235 -S -m 0x00020000k -serial mon:stdio -bios none -nographic -no-reboot -device loader,file=hello.img,addr=0x80100000,force-raw=on -kernel hello.img | tee hello.out &
sh -e /home/felipe/UFSC/SO-II/rv32/bin/riscv32-unknown-linux-gnu-gdb -ex "target remote:1235" -ex "set confirm off" -ex "add-symbol-file /home/felipe/UFSC/SO-II/epos/app/hello/hello 0x800000a0"
/home/felipe/UFSC/SO-II/rv32/bin/riscv32-unknown-linux-gnu-gdb: 1: ELF: not found
my professor asked me to test these commands by themselves, and I did. This one:
qemu-system-riscv32 -machine virt -cpu rv32 -smp 1 -gdb tcp::1235 -S -m 0x00020000k -serial mon:stdio -bios none -nographic -no-reboot -device loader,file=hello.img,addr=0x80100000,force-raw=on -kernel hello.img
works fine. But this one:
sh -e /home/felipe/UFSC/SO-II/rv32/bin/riscv32-unknown-linux-gnu-gdb -ex "target remote:1235" -ex "set confirm off" -ex "add-symbol-file /home/felipe/UFSC/SO-II/epos/app/hello/hello 0x800000a0"
is the one responsible for the returning error:
/home/felipe/UFSC/SO-II/rv32/bin/riscv32-unknown-linux-gnu-gdb: 1: ELF: not found
(after "ELF" there are 5 squares, like those that show when you receive and emoji you don't have)
I tested gdb with a simple C program and it worked, so I know that QEMU nor gdb are the problem here.
Its my first time working with this type of development in C/C++ and using QEMU and GDB. I'm not sure if there is anything else I need to share with you regarding code besides the repo link above. If I'm missing something, let me know.
Also, I'm working on WSL2 (using Ubuntu 20.04) but "pure" ubuntu answers are also helpful.
Thank you.
EDIT: after posting I realized that the mentioned squares after ELF are not showing here. So I'm posting a print screen of it:
terminal error
How do I attach gdb to ARM Qemu board with each smp running different kernels? When I use gdb options, I can only specify one kernel with the file option in gdb.
Qemu Command :
qemu-system-aarch64 -M virt -smp 2 \
-display none -nographic \
-device loader,file=f1.axf,cpu-num=0 \
-device loader,file=f2.axf,cpu-num=1 -s -S
gdb commands ran:
gdb-multiarch
target remote localhost:1234
file f1.axf
After this, gdb shows two threads, both showing debug source as f1.axf.
If I pass f2.axf in file option, both thread show source and debug info from f2.axf.
There is no error message from gdb
Setup:
Host: Ubuntu 18.04, 64bits
Guest: Qemu Arm
GDB Multiarch: Running on Host machine(Ubuntu)
I had to add each smp cpu as an Arm Cpucluster in my Qemu board file. Make sure that you assign different cluster index to each cpu, else they will attach to same GDB. So for N number of clusters, you can attach N gdbs. After that gdb can be attached to Qemu listening on port 1234, using following commands:
gdb-multiarch
target extended :1234
file f1.axf
add-inferior
inferior 2
attach 2
file f2.axf
info thread
Add as many inferiors as many cpu clusters you have. To attach to cluster 4, add command attach 4 in gdb.
I'm trying to exploit the binaries from Damn vulnerable Router Firmware but I have issues with debuggging with gdb.
to run the program i use this command :
sudo chroot . ./qemu-mipsel-static ./pwnable/Intro/stack_bof_01
and it works but when i try to run gdb with :
sudo chroot . ./qemu-mipsel-static gdb ./pwnable/Intro/stack_bof_01
I have that :
(gdb) r
Starting program: /pwnable/Intro/stack_bof_01
qemu: Unsupported syscall: 4026
Cannot exec /bin/bash: No such file or directory.
qemu: Unsupported syscall: 4026
Could not open /proc/12532/status
I tried to copy the binary in a qemu VM but I don't have the whole system so it don't work.
So , please , what's is the best way to debug a program from a firmware on a different architecture than x86 ?
In qemu user mode, run the program using the command with the option -g:
sudo chroot . ./qemu-mipsel-static -g 1234 ./pwnable/Intro/stack_bof_01
then start the gdb-multiarch (or gdb that corresponds to that architecture), and attach to it like this:
target remote 127.0.0.1:1234
then you can debug it happily.
Qemu terminated with the log : "QEMU: Terminated via GDBstub" when I tried to connect to QEmu from GDB .
I started the QEMU with the following command in one terminal :
qemu-system-arm -serial telnet:localhost:1235,server,nowait,ipv4 -serial telnet:localhost:1236,server,nowait,ipv4 -serial telnet:localhost:1238,server,nowait,ipv4 -gdb tcp:localhost:1234,server,ipv4 -kernel ./build/final.elf -M versatilepb -nographic -m 256 -S
And then in another terminal I started GDB with the command :
arm-none-eabi-gdb --command=~/.gdbinit
And the file .gdbinit contains the text:
set history save on
set logging on
target remote localhost:1234
load ./build/final.elf
sym ./build/final.elf
b break_virtual
Can you please let me know whats going wrong here?
GDB automagically loads ~/.gdbinit
so when you load .gdbinit via --command=~/.gdbinit
it runs the script twice,
when it gets to the 2nd invocation of target remote localhost:1234
gdb hangs up its initial connection, qemu quits,
then gdb fails to reconnect to it because it is no longer running.
Either get rid of the --command option or rename the file.
I have a program using LD_PRELOAD. The program should be run like
this, "LD_PRELOAD=/path/to/libfoo.so qemu -U LD_PRELOAD a.out", if
without gdb.
Here are what I did while running gdb.
(gdb) set environment LD_PRELOAD=/nfs_home/chenwj/tools/lib/libdbo.so
(gdb) file /nfs_home/chenwj/tools/bin/qemu-i386
(gdb) r -U LD_PRELOAD bzip2_base.i386-m32-gcc44-annotated input.source 1
But gdb gave me the error below
Starting program: /nfs_home/chenwj/tools/bin/qemu-i386 -U LD_PRELOAD bzip2_base.i386-m32-gcc44-annotated input.source 1
bash: open "/bin/bash" failed: Permission denied
During startup program exited with code 66.
Any sugguestion appreciated.
Regards, chenwj
GDB does not invoke your executable directly. Instead, it does
bash -c '/nfs_home/chenwj/tools/bin/qemu-i386 -U LD_PRELOAD bzip2_base.i386-m32-gcc44-annotated input.source 1'
This is done so that bash takes care of I/O redirection (which you are not using).
My guess is that /bin/bash doesn't work when LD_PRELOAD=libdbo.so is in effect, though I don't understand the exact nature of failure.
One way to work around this problem is to create a wrapper executable, implementing C equivalent of this:
export LD_PRELOAD=/nfs_home/chenwj/tools/lib/libdbo.so
exec /nfs_home/chenwj/tools/bin/qemu-i386 "$#"
and debug that executable (without setting LD_PRELOAD). You'll see an extra SIGTRAP when the wrapper execve()s the wrapped qemu-i386, which you should ignore and continue.