Linux BTF: bpftool: Failed to get EHDR from /sys/kernel/btf/vmlinux - c++

I am trying to get started with BPF CO-RE development.
Using Ubuntu 20.04 LTS in a VM, I needed to recompile the kernel and install pahole (from apt install dwarves) so that BTF is enabled (I set CONFIG_DEBUG_FS=y and CONFIG_DEBUG_INFO_BTF=y).
So my setup is:
Ubuntu 20.04
Kernel 5.4.0-90-generic
bpftool --version: /usr/lib/linux-tools/5.4.0-90-generic/bpftool v5.4.148
/sys/kernel/btf/vmlinux exists and can be read out with cat.
But bpftool shows the following error:
$ sudo bpftool btf dump file /sys/kernel/btf/vmlinux format c
libbpf: failed to get EHDR from /sys/kernel/btf/vmlinux
Error: failed to load BTF from /sys/kernel/btf/vmlinux: Unknown error -4001
From https://github.com/libbpf/libbpf/blob/master/src/libbpf.h
it looks like it is LIBBPF_ERRNO__FORMAT, /* BPF object format invalid */
but I cannot find out what's wrong.
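(As a side note, a minimal sketch of my own, not from the original post: libbpf ships libbpf_strerror(), which maps libbpf-specific error numbers like -4001 to readable messages, assuming the libbpf headers and library are installed; build with -lbpf.)
#include <stdio.h>
#include <bpf/libbpf.h>

int main(void)
{
    char buf[128];

    /* -4001 is the code reported by bpftool above; libbpf_strerror()
     * resolves libbpf-specific codes to their symbolic meaning. */
    libbpf_strerror(-4001, buf, sizeof(buf));
    printf("%s\n", buf);   /* expected: the LIBBPF_ERRNO__FORMAT message */
    return 0;
}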
Does anybody know where the mistake might be?
Thanks in advance!
EDIT: Added bpftool version

You need to update bpftool to a version that supports falling back to reading BTF as raw data when the input file is not an ELF object file. The minimum required bpftool version is v5.5, as that is the Linux release where the patch landed. In general, I would recommend always using the latest bpftool version, as there are no backports.
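To illustrate the fallback that the fix relies on, here is a minimal sketch of my own (assuming a recent libbpf with btf__parse() is installed; build with -lbpf):
#include <stdio.h>
#include <bpf/btf.h>
#include <bpf/libbpf.h>

int main(void)
{
    /* btf__parse() accepts both ELF objects and raw BTF blobs such as
     * /sys/kernel/btf/vmlinux; older versions only handled ELF, which
     * is what produced the "failed to get EHDR" error above. */
    struct btf *btf = btf__parse("/sys/kernel/btf/vmlinux", NULL);

    if (libbpf_get_error(btf)) {
        fprintf(stderr, "failed to parse BTF\n");
        return 1;
    }
    printf("parsed kernel BTF successfully\n");
    btf__free(btf);
    return 0;
}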

Update:
It looks like bpftool only accepts an ELF file containing the compiled running kernel, but my /sys/kernel/btf/vmlinux is not one:
$ file /sys/kernel/btf/vmlinux
/sys/kernel/btf/vmlinux: data
Same for /boot/vmlinuz:
$ sudo file /boot/vmlinuz-5.4.0-90-generic
/boot/vmlinuz-5.4.0-90-generic: Linux kernel x86 boot executable bzImage, version 5.4.0-90-generic (root#elde-dev) #101+test1 SMP Tue Nov 23 16:38:41 UTC 2021, RO-rootFS, swap_dev 0xD, Normal VGA
Does anybody know why my /sys/kernel/btf/vmlinux does not show the right format?
I found this workaround:
Using this script (https://elixir.bootlin.com/linux/latest/source/scripts/extract-vmlinux) as suggested here (https://unix.stackexchange.com/questions/610672/where-is-the-linux-kernel-elf-file-located), I could get a "working" vmlinux file which could then be read by bpftool. But this cannot really be the right way for BPF CO-RE, I guess... Also, in all the tutorials, bpftool is used directly with /sys/kernel/btf/vmlinux.
So why do I get the wrong format?
EDIT: As suggested above, just download the newest Linux kernel source, compile bpftool from there, and use that.

Related

No source file for Netaccel_link error on running program

I have an OCaml program that worked fine on Ubuntu 16, but when recompiled and run on Ubuntu 20 I get the following error:
$ ocamldebug ./linearizer
OCaml Debugger version 4.08.1
(ocd) r
Loading program... done.
Time: 89534
Program end.
Uncaught exception: Sys_error "Illegal seek"
(ocd) b
Time: 89533 - pc: 624888 - module Netaccel_link
No source file for Netaccel_link.
I thought this was due to missing dev libraries, but:
$ sudo apt install libocamlnet-ocaml-dev
Reading package lists... Done
Building dependency tree
Reading state information... Done
libocamlnet-ocaml-dev is already the newest version (4.1.6-1build6).
0 upgraded, 0 newly installed, 0 to remove and 20 not upgraded.
What setup step am I missing on Ubuntu 20?
This looks like a regression bug in libocamlnet. You should report an issue there or, since I am a bit pessimistic that you will get any response, try to debug the issue yourself.
The problem you are facing has nothing to do with missing libraries (those would be reported during installation or, if the package is broken, would show up as linker errors). It may, however, result from some misconfiguration of the system. If that is the case, then you're lucky, as you can fix it yourself.
I will give you some advice that might help you in debugging this issue. For more, please try discuss.ocaml.org as a more suitable medium (SO doesn't favor this kind of discussion and we might get deleted by the admins).
The illegal seek exception is thrown when a seek operation is applied to a non-regular file, i.e. the ESPIPE Unix error. So check your inputs. It could be that what was previously regarded as a file in Ubuntu is now a pipe or a socket.
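For illustration, here is a small C sketch of my own (not OCaml, since the error comes from the underlying Unix call): seeking on a pipe fails with ESPIPE, which the OCaml runtime surfaces as Sys_error "Illegal seek".
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>

int main(void)
{
    int fds[2];

    if (pipe(fds) != 0)
        return 1;

    /* lseek() on a non-regular file (here: the read end of a pipe)
     * fails and sets errno to ESPIPE ("Illegal seek"). */
    if (lseek(fds[0], 0, SEEK_CUR) == (off_t)-1 && errno == ESPIPE)
        printf("lseek on a pipe: %s\n", strerror(errno));

    return 0;
}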
Try to use ltrace or strace to pinpoint the culprit, e.g.,
ltrace ./linearizer
or, if it overwhelms you, try strace
strace ./linearizer
Instead of using ocamldebug you can use plain gdb. You can use gdb's interfaces to provide the path to the source code (though most likely it won't work since ocamlnet is not compiled with debug information). I believe that it will give you a more meaningful backtrace.
Instead of using the system installation try using opam. Install your dependencies with opam and try older versions as well as newer versions of the OCaml compiler. Also, try different versions of ocamlnet. Ideally, try to reproduce the environment that used to work for you.
When nothing else works, you can use objdump -d and look at the disassembly of your binary. OCaml uses a pretty readable and intuitive name mangling scheme (<module_name>__<function_name>_<uid>), so you can easily find the source code (search for the <module_name>.ml file and look for <function_name> there).
Finally, just use docker or any other container to run your application. Consider switching from ocamlnet to something more modern and supported.

Openocd Error: invalid command name "dap" - can't connect Blue Pill via ST-Link/V2

I'm using a Blue Pill board (STM32F103CB with 128kB of flash according to st-info --probe) via a clone ST-Link/V2 like this one. I've also tested using a genuine ST-Link/V2 like this one. I get the same result, described below, with both programmers.
My system is Linux (Debian LXDE) and I've installed OpenOCD from Liviu Ionescu's releases here.
My OpenOCD installation is working. As well as the Blue Pill I have a ST-Nucleo-F103RB board, and I can connect to it using OpenOCD. The command
openocd -f board/st_nucleo_f103rb.cfg
using the standard .cfg file that ships with OpenOCD gives
Open On-Chip Debugger 0.10.0
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Info : The selected transport took over low-level target control. The results might differ compared to plain JTAG/SWD
adapter speed: 1000 kHz
adapter_nsrst_delay: 100
none separate
srst_only separate srst_nogate srst_open_drain connect_deassert_srst
Info : Unable to match requested speed 1000 kHz, using 950 kHz
Info : Unable to match requested speed 1000 kHz, using 950 kHz
Info : clock speed 950 kHz
Info : STLINK v2 JTAG v29 API v2 SWIM v18 VID 0x0483 PID 0x374B
Info : using stlink api v2
Info : Target voltage: 3.271135
Info : stm32f1x.cpu: hardware has 6 breakpoints, 4 watchpoints
But I still haven't managed to connect to my Blue Pill using the ST-Link/V2 programmers. I've read everything I can find, including relevant sections of https://elinux.org/Category:OpenOCD and as much as I can personally digest of http://openocd.org/doc/. The following is where I've got to.
The .cfg file stm32f103c8_blue_pill.cfg doesn't work for me. It produces the output described below.
Based on what I've read I've prepared my own .cfg file at ../board/stm32f103.cfg. It says:
source [find interface/stlink.cfg]
transport select hla_swd
source [find target/stm32f1x.cfg]
#source [find board/stm32f103c8_blue_pill.cfg]
#reset_config srst_only
#reset_config none separate
Sources I've read suggest this should work, but it doesn't. With my .cfg described above, I can use either target/stm32f1x.cfg or board/stm32f103c7_blue_pill.cfg, and I still get the same output as described below. (In the case of both of those .cfg files I'm using the standard files, as shipped with OpenOCD.) I've tested with both of the reset_config variants shown above, and with neither. None of the combinations works.
The file interface/stlink.cfg that I'm using is modified. I've changed it to state the correct device_desc "ST-LINK/V2" and the correct vid_pid 0x0483 0x3748. (Both confirmed using lsusb.) So, ignoring commented lines, stlink.cfg reads
interface hla
hla_layout stlink
hla_device_desc "ST-LINK/V2"
hla_vid_pid 0x0483 0x3748
I've experimented with including the hla_serial of the programmer. Interestingly, lsusb can't find the full serial number. st-info --probe finds the serial number, but gives a slightly different number from the STLinkUpgrade firmware application. I've tried using both serial numbers. No difference.
Here's the command I give to OpenOCD:
openocd -s ~/stm32/openocd/scripts -f board/stm32f103.cfg
Notice that I have to set the path using -s for this command. With the ST-Nucleo-F103RB board, I don't have to do this. With the stm32f103.cfg file, however, if I don't set the path I get:
Error: Can't find board/stm32f103.cfg
in procedure 'script'
If I use the full command shown above, with -s to set the path, I get:
Open On-Chip Debugger 0.10.0
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
/[..]stm32/openocd/scripts/target/stm32f1x.cfg:47: Error: invalid command name "dap"
in procedure 'script'
at file "embedded:startup.tcl", line 60
at file "/[..]stm32/openocd/scripts/board/stm32f103.cfg", line 18
at file "/[..]stm32/openocd/scripts/target/stm32f1x.cfg", line 47
Here's the offending line 47 of stm32f1x.cfg:
dap create $_CHIPNAME.dap -chain-position $_CHIPNAME.cpu
I've searched for items on Stackoverflow/ similar about Error: invalid command name "dap". Using the OpenOCD documentation I understand that the dap create command exists, and roughly what it does. The most similar reported error I've found documented is at https://elinux.org/OpenOCD_Troubleshooting:_Invalid_Command_Name_JTAG, and the solution suggested there doesn't seem to be applicable because I'm not invoking interface/stlink.cfg from the command line.
I can't see what I'm doing wrong, and I'm now completely stuck. If someone can give me a steer I'd be really grateful. Sorry it's such a long post.
I just encountered this problem too. Officially there are no binaries provided, only source code, but there are two sites that release binaries, both recommended by the OpenOCD project:
1. Maintained by Freddie Chopin.
2. Maintained by Liviu Ionescu in Github.
I tried the latest version (OpenOCD 0.10.0, commit date: 2017-01-22 20:31:28, build date: 2017-01-23) released on Freddie Chopin's site, and I encountered this Error: invalid command name "dap" problem. But all the *.cfg files I referenced had run normally on another computer of mine with a different OpenOCD binary (although I forget where I downloaded that binary).
Not sure what went wrong, so I turned to the latest version (gnu-mcu-eclipse-openocd-0.10.0-11-20190118-1134-win64.zip) released by GNU MCU Eclipse (maintained by Liviu Ionescu); the error was gone and the problem was solved.
PS: I'm not saying there is a bug in Freddie Chopin's build, but if someone encounters this problem, maybe you can solve it by trying the version that is currently being actively maintained.
Agree with Wulfric: using the standard install for OpenOCD
sudo apt install openocd
gave the "dap" error.
However, the GitHub version, openocd-xpack, worked correctly.
Using:
Linux clamps 4.15.0-66-generic #75-Ubuntu SMP ... as remote, Windows 8/10 as client, target MIMXRT1010-EVK

Installing HDF5 library on Cygwin: "make check" stuck at testswmr.sh, no error message

I am currently installing the HDF5 library, more precisely hdf5-1.10.0-patch1, on Cygwin, as I want to use it with Fortran. Following the instructions from the HDF Group website (here is the link), I did the following:
./configure --enable-fortran
make > "out1_check.txt" 2> "warn1_check.txt" &
make check > "out2_check.txt" 2> "warn2_check.txt" &
The execution of the last command (make check) proceeds as it should until it gets stuck. The process does not stop and something is happening (8-12% CPU in use by sh.exe, already 39 hours of CPU time), but "out2_check.txt" looks like:
Making check in src
...
[many successful checks]
...
============================
No need to test testlinks_env.sh again.
============================
============================
Testing testswmr.sh
Unfortunately, I do not have the output file from the first run of make check, but it did not contain more information on Testing testswmr.sh. There was never any error message.
So, what is this testswmr.sh, why does it get stuck and how can I finalize the installation process? Maybe I can skip the remaining checks and just proceed to make install?
Important note: an older version of HDF5 is already installed from the Cygwin repo. It does not seem to support Fortran however, so I decided to install the current version myself.
Available (and used) compilers are gcc and gfortran.
As far as I can tell, only Intel Fortran is supported on Windows. There is no Cygwin download at https://support.hdfgroup.org/HDF5/release/obtain518.html and I have never come across an experience report for Cygwin/Fortran/HDF5.
Your options:
Use Intel Fortran
Use Linux or Mac
Sorry

Valgrind: fatal error: memcheck.h: No such file or directory

We are trying to track down a Conditional jump or move depends on uninitialised value in a C++ project reported by Valgrind. The address provided in the finding is not really helpful because it points to the end of a GCC extended assembly block, and not the actual variable causing the trouble.
According to the Valgrind's Eliminating undefined values with Valgrind, the easy way, we can use VALGRIND_CHECK_MEM_IS_DEFINED or VALGRIND_CHECK_VALUE_IS_DEFINED after including <memcheck.h>. Additionally, those macros or functions are apparently documented in the header file (there is definitely no man page for them).
However, when I include <memcheck.h> or <valgrind/memcheck.h>, it results in:
fatal error: memcheck.h: No such file or directory
Based on Stack Overflow's How do I find which rpm package supplies a file I'm looking for?, I performed an RPM file search, but it returns 0 hits for memcheck.h.
QUESTIONS
The blog article is a bit dated. Does the information still apply?
If the information is accurate, then where do I find memcheck.h?
$ uname -a
Linux localhost.localdomain 4.1.4-200.fc22.x86_64 #1 SMP Tue Aug 4 03:22:33 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
$ g++ --version
g++ (GCC) 5.1.1 20150618 (Red Hat 5.1.1-4)
...
$ valgrind --version
valgrind-3.10.1
You have to install the valgrind-devel RPM, which contains memcheck.h.
The *-devel packages are typically located in the "optional" repositories (e.g. rhel-x86_64-server-optional-6 on RHEL 6). Also, you can find the RPM on Google, download it, and install it on its own. With either approach, memcheck.h is typically placed in /usr/include/valgrind once installed.
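Once the header is available, usage might look like the following minimal sketch (my own example, not from the blog article); compile normally and run the binary under valgrind to get the report at the marked line:
#include <valgrind/memcheck.h>

int main(void)
{
    int x;  /* deliberately left uninitialised */

    /* Asks memcheck to complain right here if x is undefined, instead
     * of waiting for a later conditional jump that depends on it. */
    VALGRIND_CHECK_VALUE_IS_DEFINED(x);
    return 0;
}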
Another way to dig into an uninitialised value error with Valgrind is to use the embedded gdbserver. You can then put breakpoints in your program and interactively check the definedness of various addresses/lengths using memcheck monitor commands such as:
check_memory [addressable|defined] <addr> [<len>]
check that <len> (or 1) bytes at <addr> have the given accessibility
and outputs a description of <addr>
See e.g. http://www.valgrind.org/docs/manual/mc-manual.html#mc-manual.monitor-commands for more information.

How can I find the Linux distribution name and version in code (a .c file)?

I'd like to know how, in code (a .c file), I can find the Linux distribution name and version (like Ubuntu 10.04, or CentOS 5.5...).
The C function I'm looking for should be something like the uname() system call that is used in .c files to get the kernel version.
Ideally, the function would work on all Linux distributions (i.e. be standard).
I'm not looking to get the distribution name and version by running Linux command-line tools from the code (like system("cat /etc/release");).
Any suggestion will be appreciated!
Regards
There is no standard for this yet. You can query the following files or check whether they exist (see the sketch after this list):
/etc/lsb-release
/etc/issue
/etc/*release
/etc/*version
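For illustration, a minimal sketch of my own (the file names are just common instances of the patterns above) that probes which of these files exist:
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Common instances of the /etc/*release and /etc/*version patterns. */
    const char *candidates[] = {
        "/etc/lsb-release",
        "/etc/issue",
        "/etc/os-release",
        "/etc/redhat-release",
        "/etc/debian_version",
    };
    size_t i;

    for (i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
        if (access(candidates[i], R_OK) == 0)
            printf("found: %s\n", candidates[i]);
    }
    return 0;
}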
Well, you can (and should) use fopen and fgets instead of system("cat"), for reading /etc/release.
There's no universal method though, I can even build a linux image which has no filesystem at all (except initramfs) and definitely no distribution name.
AFAIK there isn't a standard system call to get this if uname(2) doesn't give you enough info.
The safest approach is probably to check for /proc/version and read that.
You could fopen("/etc/lsb-release") and parse its contents. It looks like this:
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=10.04
DISTRIB_CODENAME=lucid
DISTRIB_DESCRIPTION="Ubuntu 10.04.3 LTS"
This method is not universal. You'll need to make sure that it works on all the distros that you care about (if it doesn't, I suggest you go with @ott--'s answer).
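As a rough sketch of that approach (my own code, assuming the file format shown above), using fopen/fgets as suggested in the earlier answer:
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *f = fopen("/etc/lsb-release", "r");
    char line[256];

    if (!f) {
        perror("fopen /etc/lsb-release");
        return 1;
    }
    while (fgets(line, sizeof(line), f)) {
        /* Lines look like: DISTRIB_DESCRIPTION="Ubuntu 10.04.3 LTS" */
        if (strncmp(line, "DISTRIB_DESCRIPTION=", 20) == 0)
            printf("distribution: %s", line + 20);
    }
    fclose(f);
    return 0;
}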
Is it acceptable to run some shell commands?
$ /usr/bin/lsb_release -r
Release: 11.04
$ /usr/bin/lsb_release -d
Description: Ubuntu 11.04
$ /usr/bin/lsb_release -rd
Description: Ubuntu 11.04
Release: 11.04
There is no portable way to do that; you'll have to use some OS detection tool/library.
Fortunately, there are a few out there. I know these two:
Facter, a professional (yet free/open) information-gathering program in Ruby: http://puppetlabs.com/puppet/related-projects/facter/
a shell script: http://www.novell.com/coolsolutions/feature/11251.html
(I used facter via puppet and it is very good.)
With a little additional scripting, you can use one of those programs' output to generate a .h that you can then use in your code.
You can even integrate this generation as a step in your makefile.
I usually inspect /etc/issue; while (as others pointed out) it is not guaranteed, I've found in the field that it's quite reliable.
As far as I've experienced, it works on Ubuntu, Debian, Red Hat, CentOS, Slackware and Arch Linux.