How to debug segmantation fault happening on 'stp' instruction in arm binary? - c++

My application randomly and rarely crashes with segmentation fault signal.
When coredump is opened in GDB following can be seen:
arm instruction leading to crash is:
0x7f8ea08130 fd 7b b7 a9 stp x29, x30, [sp,#-144]!
When code of crashed frame is browsed in GDB, breakpoint stops at opening curly brace of a function:
void SomeClass::someMethod(const std::string& s, int i)
>{
...
}
examining of 'sp' register gives following output:
x $sp
>~"0x7fc761a070:\t0xc761a270\n"
x $sp-144\n"
>~"0x7fc7619fe0:\t"
>&"Cannot access memory at address 0x7fc7619fe0\n"
>169^error,msg="Cannot access memory at address 0x7fc7619fe0"
stack trace seems fine and not corrupted
there are roughly 300 frames in stack and stack size limit is set to be 8192K
UPD: the pagesize in the system is 4k:
>grep -i pagesize /proc/1/smaps
KernelPageSize: 4 kB
MMUPageSize: 4 kB
What else I can check to debug this issue?

Related

Can not run correctly on NXP MIMXRT1061 CVL5A

Describe the bug
I am currently trying to run Zephyr7.0 on a development board equipped with MCU: NXP MIMXRT1061 CVL5A.
I simply compiled a sample: \samples\basic\blinky, but it couldn't run correctly.
At first I thought it was a problem with the XIP format that caused Zephyr to not be booted correctly, but then I used SWD to debug it and found that it was booted correctly.
But when calling: /zephyr/arch/arm/core/aarch32/prep_c.c: z_bss_zero(); function Zephyr went wrong
Compilation process
The board I use is mimxrt1060_evk, theoretically there is no problem, because 1061 is based on 1060
west build -p auto -b mimxrt1060_evk .\samples\basic\blinky\
west flash
Debugging process
The first debugging was broken at z_interrupt_stacks
z_arm_reset () at C:/Users/zhihao3x/work/zephyrproject/zephyr/arch/arm/core/aarch32/cortex_m\reset.S:105
105 msr BASEPRI, r0
(gdb) n
134 ldr r0, =z_interrupt_stacks
(gdb) n
135 ldr r1, =CONFIG_ISR_STACK_SIZE + MPU_GUARD_ALIGN_AND_SIZE
(gdb) n
136 adds r0, r0, r1
(gdb) n
137 msr PSP, r0
(gdb)
138 mrs r0, CONTROL
(gdb)
139 movs r1, #2
(gdb)
140 orrs r0, r1 /* CONTROL_SPSEL_Msk */
(gdb)
141 msr CONTROL, r0
(gdb)
147 isb
(gdb)
154 bl z_arm_prep_c
(gdb)
134 ldr r0, =z_interrupt_stacks
(gdb)
Program received signal SIGTRAP, Trace/breakpoint trap.
(gdb)
Later, I used single-step debugging and found that Zephyr reached z_bss_zero
(gdb)
z_arm_floating_point_init ()
at C:/Users/zhihao3x/work/zephyrproject/modules/hal/cmsis/CMSIS/Core/Include/cmsis_gcc.h:1003
1003 __ASM volatile ("MSR control, %0" : : "r" (control) : "memory");
(gdb)
163 __set_CONTROL(__get_CONTROL() & (~(CONTROL_FPCA_Msk)));
(gdb)
0x6000305c in __set_CONTROL (control=3758157056)
at C:/Users/zhihao3x/work/zephyrproject/modules/hal/cmsis/CMSIS/Core/Include/cmsis_gcc.h:1003
1003 __ASM volatile ("MSR control, %0" : : "r" (control) : "memory");
(gdb)
1004 __ISB();
(gdb)
__ISB () at C:/Users/zhihao3x/work/zephyrproject/modules/hal/cmsis/CMSIS/Core/Include/cmsis_gcc.h:260
260 __ASM volatile ("isb 0xF":::"memory");
(gdb)
z_arm_prep_c () at C:/Users/zhihao3x/work/zephyrproject/zephyr/arch/arm/core/aarch32/prep_c.c:183
183 z_bss_zero();
(gdb) n
Program received signal SIGTRAP, Trace/breakpoint trap.
0x62652dc0 in ?? ()
Eventually I located the problem in the memset function,when this function is called, it will fall into an infinite loop and then my program will crash
z_arm_prep_c () at C:/Users/zhihao3x/work/zephyrproject/zephyr/arch/arm/core/aarch32/prep_c.c:183
183 z_bss_zero();
(gdb)
z_bss_zero () at C:/Users/zhihao3x/work/zephyrproject/zephyr/kernel/init.c:89
89 (void)memset(__bss_start, 0, __bss_end - __bss_start);
(gdb)
memset (buf=0x80000030 <z_idle_threads>, c=c#entry=0, n=428)
at C:/Users/zhihao3x/work/zephyrproject/zephyr/lib/libc/minimal/source/string/string.c:355
(gdb)
Program received signal SIGTRAP, Trace/breakpoint trap.
0x63f5fbf8 in ?? ()
Later I tried to debug the memset function and found that the dest address did not change during the loop
Very strange, I do not specify whether it is a gdb problem or a zephyr problem. The address is 0x80000030 before the increment, and 0x80000031 after the increment, and it changes back to 0x80000030 when the next cycle is repeated.
357 unsigned char c_byte = (unsigned char)c;
(gdb)
389 while (n > 0) {
(gdb)
390 *(d_byte++) = c_byte;
392 n--;
(gdb) p d_byte
$1 = (unsigned char *) 0x80000030 <z_idle_threads> "\b"
I tried to change the z_bss_zero function
(void)memset(__bss_start, 0, __bss_end - __bss_start);
Changed the size length
(void)memset(__bss_start, 0, 3);
But when I was debugging with gdb, I found that the while loop inside memset exceeded three times without stopping, and it entered an endless loop.
I speculate that there is a problem with the incremented code after these two lines of code, because when I use the gdb print command to print the value of the variable after each increment, the value of the variable does not change
*(d_byte++) = c_byte;
n--;
Another weird place is that when I use gdb to debug, I find that there is a pointer. Gdb will not break to this side, but skip this code and execute it. I doubt that the compiler is doing it to me. Is the code optimized?
More details
When I use next to execute z_bss_zero, gdb will break to an indeterminate address at 0xdeadbeee. I guess it may be that Zephyr destroyed the stack when initializing the memory.
(gdb)
z_arm_prep_c () at C:/Users/zhihao3x/work/zephyrproject/zephyr/arch/arm/core/aarch32/prep_c.c:183
183 z_bss_zero();
(gdb) n
Program received signal SIGTRAP, Trace/breakpoint trap.
0xdeadbeee in ?? ()
At the same time, the Jlink debugger also output an error log
ERROR: Cannot read register 15 (R15) while CPU is running
Reading all registers
ERROR: Cannot read register 0 (R0) while CPU is running
ERROR: Cannot read register 1 (R1) while CPU is running
ERROR: Cannot read register 2 (R2) while CPU is running
ERROR: Cannot read register 3 (R3) while CPU is running
ERROR: Cannot read register 4 (R4) while CPU is running
ERROR: Cannot read register 5 (R5) while CPU is running
ERROR: Cannot read register 6 (R6) while CPU is running
ERROR: Cannot read register 7 (R7) while CPU is running
ERROR: Cannot read register 8 (R8) while CPU is running
ERROR: Cannot read register 9 (R9) while CPU is running
ERROR: Cannot read register 10 (R10) while CPU is running
ERROR: Cannot read register 11 (R11) while CPU is running
ERROR: Cannot read register 12 (R12) while CPU is running
ERROR: Cannot read register 13 (R13) while CPU is running
ERROR: Cannot read register 14 (R14) while CPU is running
ERROR: Cannot read register 15 (R15) while CPU is running
ERROR: Cannot read register 16 (XPSR) while CPU is running
ERROR: Cannot read register 17 (MSP) while CPU is running
ERROR: Cannot read register 18 (PSP) while CPU is running
ERROR: Cannot read register 24 (PRIMASK) while CPU is running
ERROR: Cannot read register 25 (BASEPRI) while CPU is running
ERROR: Cannot read register 26 (FAULTMASK) while CPU is running
ERROR: Cannot read register 27 (CONTROL) while CPU is running
ERROR: Cannot read register 32 (FPSCR) while CPU is running
ERROR: Cannot read register 33 (FPS0) while CPU is running
ERROR: Cannot read register 34 (FPS1) while CPU is running
ERROR: Cannot read register 35 (FPS2) while CPU is running
ERROR: Cannot read register 36 (FPS3) while CPU is running
ERROR: Cannot read register 37 (FPS4) while CPU is running
ERROR: Cannot read register 38 (FPS5) while CPU is running
ERROR: Cannot read register 39 (FPS6) while CPU is running
ERROR: Cannot read register 40 (FPS7) while CPU is running
ERROR: Cannot read register 41 (FPS8) while CPU is running
ERROR: Cannot read register 42 (FPS9) while CPU is running
ERROR: Cannot read register 43 (FPS10) while CPU is running
ERROR: Cannot read register 44 (FPS11) while CPU is running
ERROR: Cannot read register 45 (FPS12) while CPU is running
ERROR: Cannot read register 46 (FPS13) while CPU is running
ERROR: Cannot read register 47 (FPS14) while CPU is running
ERROR: Cannot read register 48 (FPS15) while CPU is running
ERROR: Cannot read register 49 (FPS16) while CPU is running
ERROR: Cannot read register 50 (FPS17) while CPU is running
ERROR: Cannot read register 51 (FPS18) while CPU is running
ERROR: Cannot read register 52 (FPS19) while CPU is running
ERROR: Cannot read register 53 (FPS20) while CPU is running
ERROR: Cannot read register 54 (FPS21) while CPU is running
ERROR: Cannot read register 55 (FPS22) while CPU is running
ERROR: Cannot read register 56 (FPS23) while CPU is running
ERROR: Cannot read register 57 (FPS24) while CPU is running
ERROR: Cannot read register 58 (FPS25) while CPU is running
ERROR: Cannot read register 59 (FPS26) while CPU is running
ERROR: Cannot read register 60 (FPS27) while CPU is running
ERROR: Cannot read register 61 (FPS28) while CPU is running
ERROR: Cannot read register 62 (FPS29) while CPU is running
ERROR: Cannot read register 63 (FPS30) while CPU is running
ERROR: Cannot read register 64 (FPS31) while CPU is running
ERROR: Cannot read register 33 (FPS0) while CPU is running
ERROR: Cannot read register 34 (FPS1) while CPU is running
ERROR: Cannot read register 35 (FPS2) while CPU is running
ERROR: Cannot read register 36 (FPS3) while CPU is running
ERROR: Cannot read register 37 (FPS4) while CPU is running
ERROR: Cannot read register 38 (FPS5) while CPU is running
ERROR: Cannot read register 39 (FPS6) while CPU is running
ERROR: Cannot read register 40 (FPS7) while CPU is running
ERROR: Cannot read register 41 (FPS8) while CPU is running
ERROR: Cannot read register 42 (FPS9) while CPU is running
ERROR: Cannot read register 43 (FPS10) while CPU is running
ERROR: Cannot read register 44 (FPS11) while CPU is running
ERROR: Cannot read register 45 (FPS12) while CPU is running
ERROR: Cannot read register 46 (FPS13) while CPU is running
ERROR: Cannot read register 47 (FPS14) while CPU is running
ERROR: Cannot read register 48 (FPS15) while CPU is running
ERROR: Cannot read register 49 (FPS16) while CPU is running
ERROR: Cannot read register 50 (FPS17) while CPU is running
ERROR: Cannot read register 51 (FPS18) while CPU is running
ERROR: Cannot read register 52 (FPS19) while CPU is running
ERROR: Cannot read register 53 (FPS20) while CPU is running
ERROR: Cannot read register 54 (FPS21) while CPU is running
ERROR: Cannot read register 55 (FPS22) while CPU is running
ERROR: Cannot read register 56 (FPS23) while CPU is running
ERROR: Cannot read register 57 (FPS24) while CPU is running
ERROR: Cannot read register 58 (FPS25) while CPU is running
ERROR: Cannot read register 59 (FPS26) while CPU is running
ERROR: Cannot read register 60 (FPS27) while CPU is running
ERROR: Cannot read register 61 (FPS28) while CPU is running
ERROR: Cannot read register 62 (FPS29) while CPU is running
ERROR: Cannot read register 63 (FPS30) while CPU is running
ERROR: Cannot read register 64 (FPS31) while CPU is running
Removing breakpoint # address 0x60003068, Size = 2
WARNING: Failed to read memory # address 0xDEADBEEE
Jlink output
The following is the MCU information output by Jlink, I am not sure if this version is supported by Zephyr
-----GDB Server start settings-----
GDBInit file: none
GDB Server Listening port: 2331
SWO raw output listening port: 2332
Terminal I/O port: 2333
Accept remote connection: localhost only
Generate logfile: off
Verify download: off
Init regs on start: off
Silent mode: on
Single run mode: on
Target connection timeout: 5000 ms
------J-Link related settings------
J-Link Host interface: USB
J-Link script: none
J-Link settings file: none
------Target related settings------
Target device: MIMXRT1062xxx6A
Target interface: SWD
Target interface speed: auto
Target endian: little
I have tried hard for seven days and still can’t solve this problem. I tried to switch the version of Zephyr and also used platform IO to generate elf and bin files, but they were all ineffective. Please help me. Thank you very much.

Debugging a nasty SIGILL crash: Text Segment corruption

Ours is a PowerPC based embedded system running Linux. We are encountering a random SIGILL crash which is seen for wide variety of applications. The root-cause for the crash is zeroing out of the instruction to be executed. This indicates corruption of the text segment residing in memory. As the text segment is loaded read-only, the application cannot corrupt it. So I am suspecting some common sub-system (DMA?) causing this corruption. Since the problem takes days to reproduce (crash due to SIGILL) it is getting difficult to investigate. So to begin with I want to be able to know if and when the text segment of any application has been corrupted.
I have looked at the stack trace and all the pointers, registers are proper.
Do you guys have any suggestions how I can go about it?
Some Info:
Linux 3.12.19-rt30 #1 SMP Fri Mar 11 01:31:24 IST 2016 ppc64 GNU/Linux
(gdb) bt
0 0x10457dc0 in xxx
Disassembly output:
=> 0x10457dc0 <+80>: mr r1,r11
0x10457dc4 <+84>: blr
Instruction expected at address 0x10457dc0: 0x7d615b78
Instruction found after catching SIGILL 0x10457dc0: 0x00000000
(gdb) maintenance info sections
0x10006c60->0x106cecac at 0x00006c60: .text ALLOC LOAD READONLY CODE HAS_CONTENTS
Expected (from the application binary):
(gdb) x /32 0x10457da0
0x10457da0 : 0x913e0000 0x4bff4f5d 0x397f0020 0x800b0004
0x10457db0 : 0x83abfff4 0x83cbfff8 0x7c0803a6 0x83ebfffc
0x10457dc0 : 0x7d615b78 0x4e800020 0x7c7d1b78 0x7fc3f378
0x10457dd0 : 0x4bcd8be5 0x7fa3eb78 0x4857e109 0x9421fff0
Actual (after handling SIGILL and dumping nearby memory locations):
Faulting instruction address: 0x10457dc0
0x10457da0 : 0x913E0000
0x10457db0 : 0x83ABFFF4
=> 0x10457dc0 : 0x00000000
0x10457dd0 : 0x4BCD8BE5
0x10457de0 : 0x93E1000C
Edit:
One lead that we have is that the corruption is always occurring at an offset that ends with 0xdc0.
For e.g.
Faulting instruction address: 0x10653dc0 << printed by our application after catching SIGILL
Faulting instruction address: 0x1000ddc0 << printed by our application after catching SIGILL
flash_erase[8557]: unhandled signal 4 at 0fed6dc0 nip 0fed6dc0 lr 0fed6dac code 30001
nandwrite[8561]: unhandled signal 4 at 0fed6dc0 nip 0fed6dc0 lr 0fed6dac code 30001
awk[4448]: unhandled signal 4 at 0fe09dc0 nip 0fe09dc0 lr 0fe09dbc code 30001
awk[16002]: unhandled signal 4 at 0fe09dc0 nip 0fe09dc0 lr 0fe09dbc code 30001
getStats[20670]: unhandled signal 4 at 0fecfdc0 nip 0fecfdc0 lr 0fecfdbc code 30001
expr[27923]: unhandled signal 4 at 0fe74dc0 nip 0fe74dc0 lr 0fe74dc0 code 30001
Edit 2: Another lead is that the corruption is always occurring at physical frame number 0x00a4d. I suppose with PAGE_SIZE of 4096 this translates to physical address of 0x00A4DDC0. We are suspecting couple of our kernel drivers and investigating further. Is there any better idea (like putting hardware watchpoint) which could be more efficient? How about KASAN as suggested below?
Any help is appreciated. Thanks.
1.) Text segment is RO, but the permissions could be changed by mprotect, you can check that if you think it is possible
2.) If it is kernel problem:
Run kernel with KASAN and KUBSAN (undefined behaviour) sanitizers
Focus on drivers code not included in mainline
The hint here is one byte corruption. Maybe i'm wrong, but it means that DMA is not to blame. It looks like some kind of invalid store.
3.) Hardware. I think, your problem looks like a hardware problem (RAM issue).
You can try to decrease RAM system frequency in bootloader
Check if this problem reproduces on stable mainline software, that is how you can prove that it's it

gdb - identify the reason of the segmentation fault

I have a server/client system that runs well on my machines. But it core dumps at one of the users machine (OS: Centos 5). Since I don't have access to the user's machine so I built a debug mode binary and asked the user to try it. The crash did happened again after around 2 days of running. And he sent me the core dump file. Loading the core dump file with gdb, it did shows the crash location but I don't understand the reason (sorry, my previous experience is mostly with Windows. I don't have much experience with Linux/gdb). I would like have your input. Thanks!
1. the /var/log/messages at the user's machine shows the segfault:
Jan 16 09:20:39 LPZ08945 kernel: LSystem[4688]: segfault at 0000000000000000 rip 00000000080e6433 rsp 00000000f2afd4e0 error 4
This message indicates that there is a segfault at instruction pointer 80e6433 and stack pointer f2afd4e0. Looks that the program tries to read/write at address 0.
2. load the core dump file into gdb and it shows the crash location:
$gdb LSystem core.19009
GNU gdb (GDB) CentOS (7.0.1-45.el5.centos)
... (many lines of outputs from gdb omitted)
Core was generated by `./LSystem'.
Program terminated with signal 11,
Segmentation fault.
'#0' 0x080e6433 in CLClient::connectToServer (this=0xf2afd898, conn=11) at liccomm/LClient.cpp:214
214 memcpy((char *) & (a4.sin_addr), pHost->h_addr, pHost->h_length);
gdb says the crash occurs at Line 214?
3. Frame information. (at Frame #0)
(gdb) info frame
Stack level 0, frame at 0xf2afd7e0:
eip = 0x80e6433 in CLClient::connectToServer (liccomm/LClient.cpp:214); saved eip 0x80e6701
called by frame at 0xf2afd820
source language c++.
Arglist at 0xf2afd7d8, args: this=0xf2afd898, conn=11
Locals at 0xf2afd7d8, Previous frame's sp is 0xf2afd7e0
Saved registers:
ebx at 0xf2afd7cc, ebp at 0xf2afd7d8, esi at 0xf2afd7d0, edi at 0xf2afd7d4, eip at 0xf2afd7dc
The frame is at f2afd7e0, why it's different than the rsp from Part 1, which is f2afd4e0? I guess the user may have provided me with mismatched core dump file (whose pid is 19009) and /var/log/messages file (which indicates a pid 4688).
4. The source
(gdb) list +
209
210 //pHost is declared as struct hostent* and 'pHost = gethostbyname(serverAddress);'
211 memset( &a4, 0, sizeof(a4) );
212 a4.sin_family = AF_INET;
213 a4.sin_port = htons( nPort );
214 memcpy((char *) & (a4.sin_addr), pHost->h_addr, pHost->h_length);
215
216 aalen = sizeof(a4);
217 aa = (struct sockaddr *)&a4;
I could not see anything wrong with Line 214. And this part of the code must ran many times during the runtime of 2 days.
5. The variables
Since gdb indicated that Line 214 was the culprit. I printed everything.
memcpy((char *) & (a4.sin_addr), pHost->h_addr, pHost->h_length);
(gdb) print a4.sin_addr
$1 = {s_addr = 0}
(gdb) print &(a4.sin_addr)
$2 = (in_addr *) 0xf2afd794
(gdb) print pHost->h_addr_list[0]
$3 = 0xa24af30 "\202}\204\250"
(gdb) print pHost->h_length
$4 = 4
(gdb) print memcpy
$5 = {} 0x2fcf90
So I basically printed everything that's at Line 214. ('pHost->h_addr_list[0]' is 'pHost->h_addr' due to '#define h_addr h_addr_list[0]')
I was not able to catch anything wrong. Did you catch anything fishy? Is it possible the memory has been corrupted somewhere else? I appreciate your help!
[edited] 6. back trace
(gdb) bt
'#0' 0x080e6433 in CLClient::connectToServer (this=0xf2afd898, conn=11) at liccomm/LClient.cpp:214
'#1' 0x080e6701 in CLClient::connectToLMServer (this=0xf2afd898) at liccomm/LClient.cpp:121
... (Frames 2~7 omitted, not relevant)
'#8' 0x080937f2 in handleConnectionStarter (par=0xf3563f98) at LManager.cpp:166
'#9' 0xf7f5fb41 in ?? ()
'#10' 0xf3563f98 in ?? ()
'#11' 0xf2aff31c in ?? ()
'#12' 0x00000000 in ?? ()
I followed the nested calls. They are correct.
The problem with the memcpy is that the source location is not of the same type than the destination.
You should use inet_addr to convert addresses from string to binary
a4.sin_addr = inet_addr(pHost->h_addr);
The previous code may not work depending on the implementation (some my return struct in_addr, others will return unsigned long, but the principle is the same.

Assembly for MSP430 doesn't work

I'm currently learning assembly with MSP430.
This simple code should trigger an interrupt for TimerA.
#include "msp430.h" ; #define controlled include file
RSEG CSTACK
RSEG CODE
Reset:
mov.w #WDTPW|WDTHOLD, &WDTCTL ; stop watchdog
mov.w #MC_0|TACLR, &TACTL; stop timer
mov.w #7D0h, &TACCR0; count to 2000
mov.w #CCIE ,&TACCTL0
mov.w #MC_1|TAIE|TASSEL_2, &TACTL; RESET timerA, start it and set clock to VCO
mov.b #BIT0, &P2DIR; output
mov.b #BIT0, &P2OUT;
Main:
mov.w #GIE, SR;
jmp Main;
INTER:
xor.b #BIT0, &P1OUT ; toggle LED
reti
RSEG RESET
DW Reset;
ORG 0xFFF2; address of interrupt
DW INTER
END
But get following error after trying to load code on my MSP:
Error[e104]: Failed to fit all segments into specified ranges. Problem
discovered in segment CODE. Unable to place 1 block(s) (0xfff4 byte(s) total)
in 0x7e0 byte(s) of memory.
The problem occurred while processing the segment placement command
"-P(CODE)CODE=F800-FFDF", where at the moment of placement the
available memory ranges were "CODE:f800-ffdf"
Error while running Linker
I am sure it is some noob error, but can you explain what it is?

Why would cortex-m3 reset to address 0 in gdb?

I am building a cross-compile toolchain for the Stellaris LM3S8962 cortex-m3 chip. The test c++ application I have written will execute for some time then fault. The fault will occur when I try to access a memory-mapped hardware device. At the moment my working hypothesis is that I am missing some essential chip initialization in my startup sequence.
What I would like to understand is why would the execution in gdb get halted and the program counter be set to 0? I have the vector table at 0x0, but the first value is the stack pointer. Shouldn't I end up in one of the fault handlers I specify in the vector table?
(gdb)
187 UARTSend((unsigned char *)secret, 2);
(gdb) cont
Continuing.
lm3s.cpu -- clearing lockup after double fault
Program received signal SIGINT, Interrupt.
0x00000000 in g_pfnVectors ()
(gdb) info registers
r0 0x1 1
r1 0x32 50
r2 0xffffffff 4294967295
r3 0x0 0
r4 0x74518808 1951500296
r5 0xc24c0551 3259762001
r6 0x42052dac 1107635628
r7 0x20007230 536900144
r8 0xf85444a9 4166272169
r9 0xc450591b 3293600027
r10 0xd8812546 3632342342
r11 0xb8420815 3091335189
r12 0x3 3
sp 0x200071f0 0x200071f0
lr 0xfffffff1 4294967281
pc 0x1 0x1 <g_pfnVectors+1>
fps 0x0 0
cpsr 0x60000023 1610612771
The toolchain is based on gcc, gdb, openocd.
GDB happily gave you some clue:
clearing lockup after double fault
Your CPU was in locked state. That means it could not run its "Hard Fault" Interrupt Handler (maybe there is a 0 in its Vector).
I usually get these when I forgot to "power" the periperial, the resulting Bus Error escalates first to "Hard Fault" and then to locked state. Should be mentioned in the manual of your MCU, btw.