How can I validate a newly created instruction using LLVM?
I am new to LLVM and computer architecture.
I created a new bfloat16 arithmetic instruction targeting the RISC-V 32-bit (RV32) architecture.
I would like to check whether the output of this arithmetic instruction is correct, and to verify that the value stored in the floating-point register is in IEEE 754 bfloat16 format.
clang-14 -c -g -v --target=riscv32-unknown-elf -march=rv32izfh0p1 -menable-experimental-extensions -I/usr/include -o main.o main.c
riscv32-unknown-elf-gcc -g -o main main.o
I compiled it with the commands above and confirmed that the expected assembly code is generated, as shown below.
(screenshot of the generated assembly omitted)
Then I tried to run the compiled executable with qemu-riscv32 and debug it using gdb, but an illegal instruction error occurred.
(screenshot of the illegal instruction error omitted)
Question
I think an illegal instruction error occurred because QEMU and gdb don't have information about the new instruction I created. Other than modifying QEMU and GDB, is there any way to validate the newly created instruction?

If you extend a CPU architecture by adding new instructions, you need to add support for them not just in the toolchain (which will then output code using the new instructions), but also in the CPU implementation itself, so that it can execute the code your compiler now generates. You can do that either on a real hardware CPU, which in practice starts with changes to its RTL that can be tested in simulation, or on an emulated version of the CPU such as the one QEMU provides. If you don't have any CPU implementation that includes your new instructions, there is no point in having the compiler emit them, because nothing can run the code the compiler produces.
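That said, you can at least sanity-check the arithmetic itself on the host. Below is a minimal C sketch (my own illustration, not part of the original answer) that computes the bfloat16 bit pattern you would expect, so you can compare it with whatever your simulator or register dump eventually shows. Your instruction's rounding behavior may differ (e.g. truncation instead of round-to-nearest-even), so adjust accordingly.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Host-side reference only: convert an IEEE 754 single to bfloat16
   (top 16 bits) using round-to-nearest-even. */
static uint16_t float_to_bf16(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    uint32_t lsb = (bits >> 16) & 1u;   /* last bit that will be kept */
    bits += 0x7FFFu + lsb;              /* round to nearest, ties to even */
    return (uint16_t)(bits >> 16);
}

/* Widen a bfloat16 bit pattern back to a single for doing the arithmetic. */
static float bf16_to_float(uint16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

int main(void) {
    float a = 1.5f, b = 2.25f;
    /* Expected result of a bf16 add: convert operands, add as float, re-round. */
    uint16_t r = float_to_bf16(bf16_to_float(float_to_bf16(a)) +
                               bf16_to_float(float_to_bf16(b)));
    printf("expected bf16 bit pattern: 0x%04x\n", r);
    return 0;
}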

Related

how to locate coredump function with -O2

When an application compiled with -O2 crashes, how can I locate the function or line of code that caused the crash?
In exactly the same way as you would for an application compiled without -O2.
When the application crashes in the production environment, it's hard to locate the problem.
Your first step should be to arrange for the production environment to save a core dump somewhere, or at least to print the crashing address (often logged in /var/log/messages or the like).
Once you have a core, you can use a debugger, e.g. gdb a.out core, and then the where command will list the functions leading to the crash.
If you want file and line info, you need to build a.out with the -g flag (in addition to -O2).
If you don't have a core, but do have the crashing address, then addr2line -fe a.out $address should give you the function name.
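As a concrete example of the "arrange for a core dump" step (the paths and the PID below are only illustrative):

ulimit -c unlimited                                        # allow core dumps in this shell
echo '/tmp/core.%e.%p' | sudo tee /proc/sys/kernel/core_pattern
./a.out                                                    # reproduce the crash
gdb a.out /tmp/core.a.out.1234                             # then run: where (or bt)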

llc: unsupported relocation on symbol

Problem
llc is giving me the following error:
LLVM ERROR: unsupported relocation on symbol
Detailed compilation flow
I am implementing an LLVM frontend for a middle-level IR (MIR) of a compiler. After converting various methods to bitcode files, I link them (llvm-link), optimize them (opt), convert them to machine code (llc), turn them into a shared library (clang, for its linker wrapper), and dynamically load them.
The llc step fails for some of the methods that I am compiling!
Step 1: llvm-link: Merge many bitcode files
I may have many functions calling each other, so I llvm-link the different bitcode files that might interact with each other. This step has no issues. Example:
llvm-link function1.bc function2.bc -o lnk.bc
Step 2: opt: Run optimization passes
For now I am using the following:
opt -O3 lnk.bc -o opt.bc
This step proceeds with no issues, but that's the one that CAUSES the problem!
Also, it's necessary because in the future I will need this step to run extra passes, e.g. loop unrolling.
Step 3: llc: Generate machine code (PIC)
I am using the following command:
llc -march=thumb -arm-reserve-r9 -mcpu=cortex-a9 -filetype=obj -relocation-model pic opt.bc -o obj.o
I have kept the arch-specific flags I've set, just in case they contribute to the issue. I am using Position Independent Code because in the next step I will be building a shared object.
This command fails with the error quoted at the top of this question.
Step 4: clang: Generate Shared Object
For the cases that Step 3 fails, this step isn't reached.
If llc succeeds, this step will succeed too!
Additional information
Configuration
The above runs on LLVM 3.6, on an ARM device.
Things I've noticed
If I omit -O3 (or any other level) from the opt step, then llc works.
If I keep it, and instead omit the optimization level from llc, llc still fails. Which makes me think that opt -O<level> is causing the issue.
If I use llc directly it works, but then I can't run the specific passes that opt allows me to, so this is not an option for me.
I've faced this issue with only 2 of the functions I've compiled so far (from their original MIR), both of which use loops. The others produce working code!
If I don't use the pic relocation model with llc, it can generate obj.o, but then I'll have problems creating an .so from it!
Questions
Why is this happening??!!
Why does opt have a -relocation-model option? Isn't that supposed to be an llc-only thing? I've tried setting it to pic in both opt and llc, but it still fails.
I am using clang because it has a wrapper around the linker to produce the .so. Is there a way to do this step with an LLVM tool instead?
First of all, do not use llc or opt for this at all. These are developer-side tools that should never be used in any production environment. Instead, implement your own optimization and code generation pipeline via the LLVM libraries.
As for this particular bug: the Thumb code generator might contain some bugs. Please reduce the problem and report it. Or don't use Thumb mode at all :)
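As a simpler alternative sketch (not what the answer above recommends, and the target triple, CPU, and output name here are assumptions): since clang accepts bitcode as input, you can let it drive optimization, code generation, and linking in one step, provided a working ARM cross linker and sysroot are available:

clang -target arm-linux-gnueabihf -mthumb -mcpu=cortex-a9 -O3 -fPIC -shared lnk.bc -o libfoo.so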

How to add a new line of code in an LLVM pass?

I am writing an LLVM pass in which I need to add a line of code:
list<ObserverBoardInterface*> ObserverList;
I need to add this at a particular point in the program. How would I write a pass that adds this line of code (what approach should I take), and how do I insert it at that particular point (i.e. how do I specify where the change needs to be made)?
To obtain a suitable set of instructions, you can write the C/C++ code you want and compile it to LLVM IR with the command:
clang test.cpp -emit-llvm -S -o test.ll
then open test.ll with your favorite editor and read the set of instructions.
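For example, a hypothetical test.cpp containing just the declaration from the question (ObserverBoardInterface is assumed to be a class you already have elsewhere):

#include <list>
class ObserverBoardInterface;                      // stand-in forward declaration
std::list<ObserverBoardInterface*> ObserverList;   // the line whose IR you want to study

Compiling that with the command above and reading test.ll shows which IR your pass would have to emit: a global variable plus the constructor/destructor calls that initialize and tear down the list.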
Then you can write your own pass, which:
will create a function containing the set of instructions obtained above, and
will find the point at which to insert a call to this function (see the sketch after this list).
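A rough sketch of such a pass, using the legacy pass manager (roughly LLVM 9-11 era API; the helper name initObserverList and the choice of "entry of main" as the insertion point are assumptions made purely for illustration):

#include "llvm/Pass.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/Type.h"
using namespace llvm;

namespace {
struct InsertInitCall : public FunctionPass {
  static char ID;
  InsertInitCall() : FunctionPass(ID) {}

  bool runOnFunction(Function &F) override {
    if (F.getName() != "main")        // "a particular point": here, the entry of main
      return false;
    Module *M = F.getParent();
    LLVMContext &Ctx = M->getContext();
    // Declare (or reuse) void initObserverList(void); its body would hold the
    // instructions you extracted from the clang-generated IR of your C++ line.
    FunctionCallee Init =
        M->getOrInsertFunction("initObserverList", Type::getVoidTy(Ctx));
    IRBuilder<> Builder(F.getEntryBlock().getFirstNonPHI());
    Builder.CreateCall(Init);
    return true;  // the function was modified
  }
};
} // namespace

char InsertInitCall::ID = 0;
static RegisterPass<InsertInitCall> X("insert-init", "Insert observer init call");

You would then load it with something like opt -load ./libInsertInit.so -insert-init (legacy pass-manager syntax; newer opt releases may additionally need -enable-new-pm=0).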

TI DM3730 (Design reference: beagleboard) computes wrong floating point operation results

The Situation
We have a board with a TI DM3730 processor (also known from the Beagleboard) with a Cortex A8 core (r3p2) in use with the following parameters:
Beagleboard Reference Design: Beagleboard-xM Rev-C
Kernel version: 3.2.8
Open CV library: 2.4.6
U-Boot: uboot-2013.04
Toolchain: Sourcery CodeBench ARM 2011.03
Buildroot: 2012.02
The setup is derived from this blog
Now we have a program (written in C++ and compiled with GCC 4.5.2) which uses the OpenCV library (to calculate some scores using support vector machines) and which behaves in a strange way:
The program runs on the board in its own process using defined test data: it repeatedly produces correct results.
The program runs in two or more processes (with the same defined test data): The results start to become wrong for each process, processes die with segfaults. The last remaining process runs correctly again.
The program runs in its own process (with the same defined test data again). Additionally, another process changes some exposure settings of an attached camera: The program starts to produce wrong results.
So we assume this is a very low level floating point problem.
What we tried
The complete system (all libraries, kernel, boot loader, etc.) has been compiled with the compiler flags suggested on pandorawiki.org regarding Floating_Point_Optimization:
-O3 -mcpu=cortex-a8 -mfpu=neon -ftree-vectorize -mfloat-abi=softfp
-ffast-math -fsingle-precision-constant
We tried to enable L1NEON in Cortex-A8 aux ctrl register according to the Beagle board FAQ and tried the other options mentioned there as well, but unfortunately to no avail.
All three different behaviors are reproducible, but not in the form of a minimal working example.
The same program source and the first and second scenario run correctly on Windows (using Visual Studio) and on a desktop running Linux (GCC), so it's probably not something our code does.
So the questions are now:
Are there any other known bugs with this setup and floating point operations which we are not aware of?
Are there any known compiler options which should be set or omitted which can lead to the observed results?
If a MWE would be helpful, we will look into providing one.
Any clues are welcome.
OK, we now use an up-to-date Buildroot (2014.08) with the included toolchain (arm-buildroot-linux-uclibcgnueabi-), Linux kernel 3.9.11, Boost 1.55, Qt 4.8.6, and still OpenCV 2.4.6.
When compiling, we optimize for size (-Os), and the only target optimization we use is -pipe.
The following compiler-flags are currently not used anymore:
-mcpu=cortex-a8 -mfpu=neon -ftree-vectorize -mfloat-abi=softfp -ffast-math -fsingle-precision-constant
Unfortunately, we still don't know the exact reason for the original problem, but we are quite happy that the problem went away with this setup.
So maybe this answer helps some poor soul in the future... ;)

How to map PC (ARMv5) address to source code?

I'm developing on an ARM9E processor running Linux. Sometimes my application crashes with the following message :
[ 142.410000] Alignment trap: rtspserverd (996) PC=0x4034f61c
Instr=0xe591300c Address=0x0000000d FSR 0x001
How can I translate the PC address to actual source code? In other words, how can I make sense out of this message?
With objdump. Dump your executable, then search for 4034f61c:.
The -x, --disassemble, and -l options are particularly useful.
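For example (the cross-toolchain prefix arm-linux-gnueabi- is just a guess; use whatever prefix your toolchain has):

arm-linux-gnueabi-objdump -dlS rtspserverd > rtspserverd.lst
grep -n '4034f61c:' rtspserverd.lst      # jump to the faulting instruction

If the PC turns out to fall inside a shared library rather than the executable, dump that library instead and subtract its load address, which you can find in /proc/<pid>/maps.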
You can turn on listings in the compiler and tell the linker to produce a map file. The map file will give you the meaning of the absolute addresses up to the function where the problem occurs, while the listing will help you pinpoint the exact location of the exception within the function.
For example in gcc you can do
gcc -Wa,-a,-ad -c foo.c > foo.lst
to produce a listing in the file foo.lst.
-Wa, sends the following options to the assembler (gas).
-a tells gas to produce a listing on standard output.
-ad tells gas to omit debug directives, which would otherwise add a lot of clutter.
The option for the GNU linker to produce a map file is -M or --print-map. If you link with gcc you need to pass the option to the linker with an option starting with -Wl,, for example -Wl,-M.
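For example, to have the linker write the map to a file instead of standard output (file names here are arbitrary):

gcc -g -o app foo.o -Wl,-Map=app.map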
Alternatively you could also run your application in the debugger (e.g. gdb) and look at the stack dump after the crash with the bt command.
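For example (binary name taken from the log message above):

gdb ./rtspserverd
(gdb) run
... reproduce the alignment trap / crash ...
(gdb) bt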