I'm programming bare-metal embedded (no OS) on an STM32L4 (ARM Cortex-M4). I have a separate page in flash that is written by a bootloader (it is not, and must not be, part of my application binary). In this page I store configuration parameters that are used by my application. The configuration page may change, but never at runtime; after a change I reset the processor.
How can I access this data in flash most nicely?
My definition of nice is (in this order of priority):
- support for (u)int32_t, (u)int8_t, bool, char[fixed-size]
- little overhead when compared to #define PARAM (1) or constexpr
- typesafe usage (i.e. uint8_t var = CONFIG_CHAR_ARRAY shall issue at least a warning)
- no RAM copy
- readability of the configuration parameters while debugging (using STM32CubeIDE)
The solution shall scale for all possible 2048 bytes of the flashpage. Code generation is anyhow part of the process.
So far, I have tested two variants (I am coming from plain C but am using (potentially modern) C++ in this project). My current testcase is
if (param) function_call();
but it should also work for other cases such as
for(int i = 0; i < param2; i++)
Option 1: define with pointer cast
#define CONF_PARAM1 (*(bool*)(CONFIG_ADDRESS + 0x0083))
Which leads to (using -Os):
8008872: 4b1b ldr r3, [pc, #108] ; (80088e0 <_Z16main_applicationv+0xac>)
8008874: 781b ldrb r3, [r3, #0]
8008876: b10b cbz r3, 800887c <_Z16main_applicationv+0x48>
8008878: f7ff ff56 bl 8008728 <_Z10function_callv>
80088e0: 0801f883 .word 0x0801f883
Option 2: const variable
const bool CONF_PARAM1 = *(bool*)(CONFIG_ADDRESS + 0x0083);
leading to
800887c: 4b19 ldr r3, [pc, #100] ; (80088e4 <_Z16main_applicationv+0xb0>)
800887e: 781b ldrb r3, [r3, #0]
8008880: b10b cbz r3, 8008886 <_Z16main_applicationv+0x52>
8008882: f000 f899 bl 8008728 <_Z10function_callv>
80088e4: 200000c0 .word 0x200000c0
I dislike option 2, as it adds a RAM copy (which would not scale well for 2048 bytes of config). Option 1 looks like very old C style and does not help while debugging. I struggle to find another option using the linker script, as I cannot find a way to avoid the variable ending up in the application's binary.
Is there any better way of doing this?
If you make your constant a reference, the compiler won't copy the value into a variable; it will probably just load the address when the value is used. You can then wrap the generation of the references in a templated function to make your code cleaner:
#include <cstdint>
#include <iostream>
template <typename T>
const T& configOption(uintptr_t offset)
{
const uintptr_t CONFIG_ADDRESS = 0x1000;
return *reinterpret_cast<T*>(CONFIG_ADDRESS + offset);
}
auto& CONF_PARAM1 = configOption< bool >(0x0083);
auto& CONF_PARAM2 = configOption< int >(0x0087);
int main()
{
std::cout << CONF_PARAM1 << ", " << CONF_PARAM2 << "\n";
}
GCC optimises this fairly well: https://godbolt.org/z/r27o5Q
As proposed by @old_timer in a comment above, I favour this solution:
In the linker file, I put
CONF_PARAM = _config_start + 0x0083;
In my config.hpp, I put
extern const bool CONF_PARAM;
which then can easily be accessed in any source file
if (CONF_PARAM)
This basically fulfills all "nice"-definitions of mine, as far as I can see.
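A host-side sketch of how the declarations in config.hpp might look for the other requested types, and how the type-safety requirement could be checked at compile time. The names CONF_COUNT and CONF_NAME are hypothetical, and the definitions near the bottom exist only so the sketch links on a PC; on the target they would be absent and the linker script would supply the addresses instead.

```cpp
#include <cstdint>
#include <type_traits>

// config.hpp (sketch): declarations only. On the target, the linker script
// would assign each symbol an address inside the configuration page.
extern const bool          CONF_PARAM;
extern const std::uint32_t CONF_COUNT;     // hypothetical parameter
extern const char          CONF_NAME[16];  // hypothetical fixed-size string

// A char array does not convert implicitly to an integer, so
// `std::uint8_t v = CONF_NAME;` is rejected by the compiler.
static_assert(!std::is_convertible<decltype(CONF_NAME), std::uint8_t>::value,
              "char array must not convert to uint8_t");
static_assert(sizeof(CONF_NAME) == 16, "the array size is part of the type");

// Host-only stand-ins so this sketch links without a linker script.
const bool          CONF_PARAM = true;
const std::uint32_t CONF_COUNT = 3;
const char          CONF_NAME[16] = "demo";

// Example use: the parameters read like ordinary constants.
inline std::uint32_t enabled_count() {
    return CONF_PARAM ? CONF_COUNT : 0u;
}
```

One caveat with this sketch: an array still converts implicitly to bool through pointer decay, so `bool b = CONF_NAME;` compiles, at best with a warning.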
No need to re-invent the wheel - placing data in flash is a fairly common use-case in embedded systems. When dealing with such data flash, there are some important considerations:
All data must sit at the very same address, with the same type, from case to case. This means that struct is problematic because of padding (and even more so class). If you align all data on 32 bit boundaries, this shouldn't be a problem, so I strongly recommend that you do so. Then the program becomes portable between compilers.
All these variables and pointers to them must be declared with volatile qualifier, otherwise the optimizer might go haywire. Things like (*(bool*)(CONFIG_ADDRESS + 0x0083)) are brittle and might break at any point, unless you add volatile.
You can place data at a fixed location in memory, but how to do so is compiler/linker-specific. And since it isn't standardized, it's always a pain to get right. With gcc-flavoured compilers it might be something like: __attribute__((section(".dataflash"))) where .dataflash is your custom section that you must reserve space for in the linker script. You'll need to take a closer look at how to do this with your specific toolchain (others use #pragmas etc. instead); I'll use the __attribute__ here just to illustrate.
If this section gets downloaded together with the executable binary, or only through bootloader, is up to you to define. Linker scripts typically come with a "no init" option.
So you could do something like:
// flashdata.h
typedef struct
{
uint32_t stuff;
uint32_t more_stuff;
...
} flashdata_t;
extern volatile const flashdata_t flash_data __attribute__((section(".dataflash")));
And then define it as:
// flashdata.c
volatile const flashdata_t flash_data __attribute__((section(".dataflash")));
And now you can use it as any struct, flash_data.stuff.
If you are using C, you can even split up each uint32_t chunk with union, such as typedef union { uint32_t u32; uint8_t u8 [4]; } and similar, but that isn't possible in C++, because it doesn't allow union type punning.
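The padding concern raised earlier in this answer can be turned into a build-time check. The following is a minimal sketch, assuming a struct reduced to the two fields named above and assuming they are expected at offsets 0 and 4; the real offsets would come from whatever layout the bootloader writes.

```cpp
#include <cstddef>
#include <cstdint>

// Reduced version of flashdata_t with only the two fields named above.
struct flashdata_t {
    std::uint32_t stuff;
    std::uint32_t more_stuff;
};

// Fail the build if the compiler lays the struct out differently from the
// layout the flash page is assumed to have (offsets are illustrative).
static_assert(offsetof(flashdata_t, stuff) == 0,
              "stuff must be the first word of the page");
static_assert(offsetof(flashdata_t, more_stuff) == 4,
              "more_stuff must be the second word of the page");
static_assert(sizeof(flashdata_t) == 8,
              "no padding expected between or after the two words");
```

If a different compiler or a changed field order moves anything, the build stops instead of silently reading the wrong bytes.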
You can isolate the variables in question in their own section. There is more than one way to do that. The tools build normally and do all the addressing work. As with structs across compile domains, you need to be extremely careful and should probably put checks into the code. You can build the binary and load only the code part (everything but the other flash contents); then, at that time or later, you can change the VALUES of the variables in the other section, rebuild, and isolate those into their own load.
Testing the theory
vectors.s
.globl _start
_start:
.word 0x20001000
.word reset
.thumb_func
reset:
bl main
b .
.globl dummy
.thumb_func
dummy:
bx lr
so.c
extern volatile unsigned int x;
extern volatile unsigned short y;
extern volatile unsigned char z[7];
extern void dummy ( unsigned int );
int main ( void )
{
dummy(x);
dummy(y);
dummy(z[0]<<z[1]);
return(0);
}
flashvars.c
volatile unsigned int x=1;
volatile unsigned short y=3;
volatile unsigned char z[7]={1,2,3,4,5,6,7};
flash.ld
MEMORY
{
rom0 : ORIGIN = 0x08000000, LENGTH = 0x1000
rom1 : ORIGIN = 0x08002000, LENGTH = 0x1000
ram : ORIGIN = 0x20000000, LENGTH = 0x1000
}
SECTIONS
{
.text : { *(.text*) } > rom0
.vars : { flashvars.o } > rom1
}
build
arm-none-eabi-as --warn --fatal-warnings -mcpu=cortex-m0 vectors.s -o vectors.o
arm-none-eabi-ld -nostdlib -nostartfiles -T flash.ld vectors.o so.o flashvars.o -o so.elf
arm-none-eabi-objdump -D so.elf > so.list
arm-none-eabi-objcopy -R .vars -O binary so.elf so.bin
examine
Disassembly of section .text:
08000000 <_start>:
8000000: 20001000 andcs r1, r0, r0
8000004: 08000009 stmdaeq r0, {r0, r3}
08000008 <reset>:
8000008: f000 f802 bl 8000010 <main>
800000c: e7fe b.n 800000c <reset+0x4>
0800000e <dummy>:
800000e: 4770 bx lr
08000010 <main>:
8000010: 4b08 ldr r3, [pc, #32] ; (8000034 <main+0x24>)
8000012: b510 push {r4, lr}
8000014: 6818 ldr r0, [r3, #0]
8000016: f7ff fffa bl 800000e <dummy>
800001a: 4b07 ldr r3, [pc, #28] ; (8000038 <main+0x28>)
800001c: 8818 ldrh r0, [r3, #0]
800001e: b280 uxth r0, r0
8000020: f7ff fff5 bl 800000e <dummy>
8000024: 4b05 ldr r3, [pc, #20] ; (800003c <main+0x2c>)
8000026: 7818 ldrb r0, [r3, #0]
8000028: 785b ldrb r3, [r3, #1]
800002a: 4098 lsls r0, r3
800002c: f7ff ffef bl 800000e <dummy>
8000030: 2000 movs r0, #0
8000032: bd10 pop {r4, pc}
8000034: 0800200c stmdaeq r0, {r2, r3, sp}
8000038: 08002008 stmdaeq r0, {r3, sp}
800003c: 08002000 stmdaeq r0, {sp}
Disassembly of section .vars:
08002000 <z>:
8002000: 04030201 streq r0, [r3], #-513 ; 0xfffffdff
8002004: 00070605 andeq r0, r7, r5, lsl #12
08002008 <y>:
8002008: 00000003 andeq r0, r0, r3
0800200c <x>:
800200c: 00000001 andeq r0, r0, r1
that looks good
hexdump -C so.bin
00000000 00 10 00 20 09 00 00 08 00 f0 02 f8 fe e7 70 47 |... ..........pG|
00000010 08 4b 10 b5 18 68 ff f7 fa ff 07 4b 18 88 80 b2 |.K...h.....K....|
00000020 ff f7 f5 ff 05 4b 18 78 5b 78 98 40 ff f7 ef ff |.....K.x[x.@....|
00000030 00 20 10 bd 0c 20 00 08 08 20 00 08 00 20 00 08 |. ... ... ... ..|
00000040
as does that.
arm-none-eabi-objcopy -j .vars -O binary so.elf sovars.bin
hexdump -C sovars.bin
00000000 01 02 03 04 05 06 07 00 03 00 00 00 01 00 00 00 |................|
00000010 47 43 43 3a 20 28 47 4e 55 29 20 39 2e 33 2e 30 |GCC: (GNU) 9.3.0|
00000020 00 41 30 00 00 00 61 65 61 62 69 00 01 26 00 00 |.A0...aeabi..&..|
00000030 00 05 43 6f 72 74 65 78 2d 4d 30 00 06 0c 07 4d |..Cortex-M0....M|
00000040 09 01 12 04 14 01 15 01 17 03 18 01 19 01 1a 01 |................|
00000050 1e 02 |..|
00000052
hah, okay a little more work.
MEMORY
{
rom0 : ORIGIN = 0x08000000, LENGTH = 0x1000
rom1 : ORIGIN = 0x08002000, LENGTH = 0x1000
ram : ORIGIN = 0x20000000, LENGTH = 0x1000
}
SECTIONS
{
.text : { *(.text*) } > rom0
.vars : { flashvars.o(.data) } > rom1
}
hexdump -C sovars.bin
00000000 01 02 03 04 05 06 07 00 03 00 00 00 01 00 00 00 |................|
00000010
much better.
I strongly recommend against using structs across compile domains, and this falls into that category: the build for the real data is separate from the code build, so between the two you could get data that does not land in the same place. When I do things like this I put in protections to catch the problem during execution before it goes off the rails (or, better, at build time). It is not a case of if, it is a case of when. Implementation-defined means implementation-defined.
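One possible shape for such a run-time protection is a small header in front of the separately built data, checked before any variable is used. This is only a sketch under stated assumptions: the VarsHeader layout, the magic value, the version number and the additive checksum are all hypothetical and not part of the example build above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical header placed at the start of the separately built data image.
struct VarsHeader {
    std::uint32_t magic;    // fixed marker, value chosen for illustration
    std::uint32_t version;  // layout version the code build expects
    std::uint32_t sum;      // simple additive checksum of the payload bytes
};

constexpr std::uint32_t kMagic = 0x43464731u;   // "CFG1", hypothetical
constexpr std::uint32_t kLayoutVersion = 1u;    // hypothetical

// Adds up all payload bytes; deliberately simple, a CRC would be stronger.
std::uint32_t byte_sum(const unsigned char* p, std::size_t n) {
    std::uint32_t s = 0;
    for (std::size_t i = 0; i < n; ++i) s += p[i];
    return s;
}

// Returns true only if the image matches the layout this code was built for.
bool vars_image_ok(const unsigned char* image, std::size_t size) {
    if (size < sizeof(VarsHeader)) return false;
    VarsHeader h;
    std::memcpy(&h, image, sizeof h);  // copy out, no alignment assumptions
    if (h.magic != kMagic || h.version != kLayoutVersion) return false;
    return h.sum == byte_sum(image + sizeof h, size - sizeof h);
}
```

The startup code could call vars_image_ok on the data region and refuse to continue, or fall back to defaults, when it returns false.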
But thinking about your question, this became an easy solution. And yes, technically this data is read-only, const this or that, but 1) do volatile and const go together? and 2) do you really want/need to do that?
Does it even need to be volatile? Probably not; I just banged that out to start with. After switching to const, the tool puts the variables in .rodata. Well, my tool does; it depends on how you write your linker script and, I think, on the version of binutils.
so.c
extern const unsigned int x;
extern const unsigned short y;
extern const unsigned char z[7];
extern void dummy ( unsigned int );
int main ( void )
{
dummy(x);
dummy(y);
dummy(z[0]<<z[1]);
return(0);
}
flashvars.c
const unsigned int x=1;
const unsigned short y=3;
const unsigned char z[7]={1,2,3,4,5,6,7};
flash.ld
MEMORY
{
rom0 : ORIGIN = 0x08000000, LENGTH = 0x1000
rom1 : ORIGIN = 0x08002000, LENGTH = 0x1000
ram : ORIGIN = 0x20000000, LENGTH = 0x1000
}
SECTIONS
{
.text : { *(.text*) } > rom0
.vars : { flashvars.o(.rodata) } > rom1
}
output
Disassembly of section .text:
08000000 <_start>:
8000000: 20001000 andcs r1, r0, r0
8000004: 08000009 stmdaeq r0, {r0, r3}
08000008 <reset>:
8000008: f000 f802 bl 8000010 <main>
800000c: e7fe b.n 800000c <reset+0x4>
0800000e <dummy>:
800000e: 4770 bx lr
08000010 <main>:
8000010: 4b08 ldr r3, [pc, #32] ; (8000034 <main+0x24>)
8000012: b510 push {r4, lr}
8000014: 6818 ldr r0, [r3, #0]
8000016: f7ff fffa bl 800000e <dummy>
800001a: 4b07 ldr r3, [pc, #28] ; (8000038 <main+0x28>)
800001c: 8818 ldrh r0, [r3, #0]
800001e: f7ff fff6 bl 800000e <dummy>
8000022: 4b06 ldr r3, [pc, #24] ; (800003c <main+0x2c>)
8000024: 7818 ldrb r0, [r3, #0]
8000026: 785b ldrb r3, [r3, #1]
8000028: 4098 lsls r0, r3
800002a: f7ff fff0 bl 800000e <dummy>
800002e: 2000 movs r0, #0
8000030: bd10 pop {r4, pc}
8000032: 46c0 nop ; (mov r8, r8)
8000034: 0800200c stmdaeq r0, {r2, r3, sp}
8000038: 08002008 stmdaeq r0, {r3, sp}
800003c: 08002000 stmdaeq r0, {sp}
Disassembly of section .vars:
08002000 <z>:
8002000: 04030201 streq r0, [r3], #-513 ; 0xfffffdff
8002004: 00070605 andeq r0, r7, r5, lsl #12
08002008 <y>:
8002008: 00000003 andeq r0, r0, r3
0800200c <x>:
800200c: 00000001 andeq r0, r0, r1
hexdump -C so.bin
00000000 00 10 00 20 09 00 00 08 00 f0 02 f8 fe e7 70 47 |... ..........pG|
00000010 08 4b 10 b5 18 68 ff f7 fa ff 07 4b 18 88 ff f7 |.K...h.....K....|
00000020 f6 ff 06 4b 18 78 5b 78 98 40 ff f7 f0 ff 00 20 |...K.x[x.@..... |
00000030 10 bd c0 46 0c 20 00 08 08 20 00 08 00 20 00 08 |...F. ... ... ..|
00000040
hexdump -C sovars.bin
00000000 01 02 03 04 05 06 07 00 03 00 00 00 01 00 00 00 |................|
00000010
I am trying to compile my code in CCS (Code Composer Studio) using the TI ARM Clang compiler.
I am trying to implement Ethernet functionality, which uses TI's enet SDK.
I call a function in my main() that is defined in the enet SDK, but the compiler reports the error
unresolved symbol Enet_initOsalCfg(EnetOsal_Cfg_s, first referenced in *)
I have added its library in the linker tab.
To confirm I am not doing something stupid, I used the same compiler's objdump to disassemble the library, and, if I am not mistaken, the dump clearly shows that the symbol is present.
Function I called in main() has the declaration:
void Enet_initOsalCfg(EnetOsal_Cfg *osalCfg);
Following is a snippet from the objdump having same name as my function:
Disassembly of section .text.Enet_initOsalCfg:
00000000 <Enet_initOsalCfg>:
0: 00 48 2d e9 push {r11, lr}
4: 08 d0 4d e2 sub sp, sp, #8
8: 04 00 8d e5 str r0, [sp, #4]
c: 04 00 9d e5 ldr r0, [sp, #4]
10: fe ff ff eb bl #-8 <Enet_initOsalCfg+0x10>
14: 08 d0 8d e2 add sp, sp, #8
18: 00 88 bd e8 pop {r11, pc}
Disassembly of section .rel.text.Enet_initOsalCfg:
00000000 <.rel.text.Enet_initOsalCfg>:
0: 10 00 00 00 andeq r0, r0, r0, lsl r0
4: 1c 9c 00 00 andeq r9, r0, r12, lsl r12
Disassembly of section .ARM.exidx.text.Enet_initOsalCfg:
00000000 <.ARM.exidx.text.Enet_initOsalCfg>:
0: 00 00 00 00 andeq r0, r0, r0
4: 01 00 00 00 andeq r0, r0, r1
Disassembly of section .rel.ARM.exidx.text.Enet_initOsalCfg:
00000000 <.rel.ARM.exidx.text.Enet_initOsalCfg>:
0: 00 00 00 00 andeq r0, r0, r0
4: 2a 72 00 00 andeq r7, r0, r10, lsr #4
What am I missing here?
Excuse me if I am being stupid
Big oops: our friend in the comments found my problem. I forgot extern "C". Sorry for being stupid; I had been scratching my head over this for 4 hours. My apologies :P
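For readers hitting the same error, here is a minimal sketch of the pattern. A C++ compiler mangles function names together with their parameter types, which is presumably why the linker message shows a parameter type while the library's objdump shows the plain C name. The function my_c_add is a hypothetical stand-in for the SDK call, and its definition is included only so the sketch links on its own.

```cpp
// In a real project this block would sit in (or wrap the #include of) the
// C library's header. my_c_add is a hypothetical C function.
#ifdef __cplusplus
extern "C" {
#endif

int my_c_add(int a, int b);

#ifdef __cplusplus
}
#endif

// Stand-in definition so the sketch links by itself. In the real case the
// function comes precompiled from the C library, under its unmangled name.
extern "C" int my_c_add(int a, int b) {
    return a + b;
}

// C++ caller: thanks to extern "C", this refers to the symbol `my_c_add`
// rather than a mangled C++ name.
inline int call_from_cpp() {
    return my_c_add(2, 3);
}
```

Many C headers already contain the `#ifdef __cplusplus` guard; when one does not, wrapping its `#include` in `extern "C" { ... }` on the C++ side has the same effect.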
I am having difficulty understanding how this segmentation fault is possible. The architecture of the machine is armv7l.
The core dump:
Dump of assembler code for function DLL_Disconnect:
0x6cd3a460 <+0>: 15 4b ldr r3, [pc, #84] ; (0x6cd3a4b8 <DLL_Disconnect+88>)
0x6cd3a462 <+2>: 00 21 movs r1, #0
0x6cd3a464 <+4>: 15 4a ldr r2, [pc, #84] ; (0x6cd3a4bc <DLL_Disconnect+92>)
0x6cd3a466 <+6>: 30 b5 push {r4, r5, lr}
0x6cd3a468 <+8>: 83 b0 sub sp, #12
0x6cd3a46a <+10>: 7b 44 add r3, pc
0x6cd3a46c <+12>: 01 91 str r1, [sp, #4]
0x6cd3a46e <+14>: 04 46 mov r4, r0
0x6cd3a470 <+16>: 9d 58 ldr r5, [r3, r2]
=> 0x6cd3a472 <+18>: 28 68 ldr r0, [r5, #0]
0x6cd3a474 <+20>: c0 b1 cbz r0, 0x6cd3a4a8 <DLL_Disconnect+72>
0x6cd3a476 <+22>: 21 46 mov r1, r4
...
0x6cd3a4b6 <+86>: 00 bf nop
0x6cd3a4b8 <+88>: 96 b6 00 00 .word 0x0000b696 <- replaced from objdump, as gdb prints as instruction
0x6cd3a4bc <+92>: 1c 02 00 00 .word 0x0000021c <- also replaced
The registers:
r0 0x0 0
r1 0x0 0
r2 0x21c 540
r3 0x6cd45b04 1825856260
r4 0x0 0
r5 0x1dddc 122332
...
sp 0x62afeb40 0x62afeb40
lr 0x72a3091b 1923287323
pc 0x6cd3a472 0x6cd3a472 <DLL_Disconnect+18>
cpsr 0x600c0030 1611399216
fpscr 0x0 0
The segmentation fault occurs when "ldr r0, [r5, #0]" tries to access the memory address held in r5. I get a similar message when I try to access that address in GDB:
(gdb) print *$r5
Cannot access memory at address 0x1dddc
However, all the offending register values are calculated from static values, so I don't understand how the memory address can be inaccessible.
The source code is loaded and executed via a shared library using dlopen and dlsym:
CClient* gl_pClient = NULL;
extern "C" unsigned long DLL_Disconnect(unsigned long ulHandle)
{
CProtocol* pCProtocol = NULL;
unsigned long ulResult = ACTION_INTERNAL_ERROR;
if (gl_pClient == NULL)
{
return ACTION_API_NOT_INITIALIZED;
}
...
The assembly code resolves the address of global variable gl_pClient using dll relocations, which are loaded using program-counter-relative addressing. Then the code loads from that address and crashes. It looks like the relocations got corrupted, so that the resolved address is invalid.
There isn't much else that can be said without a reproduction.
You may like to run your program under valgrind which may report memory corruption.
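The address arithmetic described above can be checked against the dump. The sketch below only reproduces the numbers from the disassembly and register listing in the question. It relies on the Thumb rule that pc, read as an operand, yields the instruction address plus 4; the constant names are made up for illustration.

```cpp
#include <cstdint>

// In Thumb state, reading pc as an operand yields the address of the
// current instruction plus 4.
constexpr std::uint32_t thumb_pc_operand(std::uint32_t instr_addr) {
    return instr_addr + 4u;
}

constexpr std::uint32_t kAddInstr  = 0x6cd3a46au;  // address of `add r3, pc`
constexpr std::uint32_t kLiteralR3 = 0x0000b696u;  // word at <+88>
constexpr std::uint32_t kLiteralR2 = 0x0000021cu;  // word at <+92>

// r3 after `add r3, pc`
constexpr std::uint32_t kR3 = kLiteralR3 + thumb_pc_operand(kAddInstr);
// address read by `ldr r5, [r3, r2]`, i.e. the slot holding &gl_pClient
constexpr std::uint32_t kSlot = kR3 + kLiteralR2;

static_assert(kR3 == 0x6cd45b04u, "matches r3 in the register dump");
static_assert(kSlot == 0x6cd45d20u, "slot from which r5 was loaded");
```

So r3 and r2 are exactly what the code computes, and the bad value 0x1dddc was read from the slot at 0x6cd45d20. Inspecting that slot (for example with `x/wx 0x6cd45d20` in GDB) would show whether it still holds an unrelocated value or was overwritten later; which of the two happened cannot be told from the dump alone.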
I have written the following function:
inline void putc(int c)
{
static Cons_serial serial;
if (serial.enabled())
serial.putc(c);
}
Where Cons_serial is a class with a non-trivial default constructor. I believe the exact class definition is not important here but you can correct me on that. I'm compiling for x86 32 bit with g++ using the following flags: -m32 -fno-PIC -ffreestanding -fno-rtti -fno-exceptions -fno-threadsafe-statics -O0, the generated assembly code for putc looks like this:
00100221 <_Z4putci>:
100221: 55 push %ebp
100222: 89 e5 mov %esp,%ebp
100224: 83 ec 08 sub $0x8,%esp
100227: b8 f8 02 10 00 mov $0x1002f8,%eax
10022c: 0f b6 00 movzbl (%eax),%eax
10022f: 84 c0 test %al,%al
100231: 75 18 jne 10024b <_Z4putci+0x2a>
100233: 83 ec 0c sub $0xc,%esp
100236: 68 f0 02 10 00 push $0x1002f0
10023b: e8 36 fe ff ff call 100076 <_ZN11Cons_serialC1Ev>
100240: 83 c4 10 add $0x10,%esp
100243: b8 f8 02 10 00 mov $0x1002f8,%eax
100248: c6 00 01 movb $0x1,(%eax)
10024b: 83 ec 0c sub $0xc,%esp
10024e: 68 f0 02 10 00 push $0x1002f0
100253: e8 4c fe ff ff call 1000a4 <_ZNK11Cons_serial7enabledEv>
100258: 83 c4 10 add $0x10,%esp
10025b: 84 c0 test %al,%al
10025d: 74 13 je 100272 <_Z4putci+0x51>
10025f: 83 ec 08 sub $0x8,%esp
100262: ff 75 08 pushl 0x8(%ebp)
100265: 68 f0 02 10 00 push $0x1002f0
10026a: e8 41 fe ff ff call 1000b0 <_ZN11Cons_serial4putcEi>
10026f: 83 c4 10 add $0x10,%esp
100272: 90 nop
100273: c9 leave
100274: c3 ret
During execution, the jump at 100231 is taken the first time the function runs, so the Cons_serial constructor is never called. My knowledge of x86 assembly is questionable; what do the instructions leading up to that jump actually do? I assume the code is meant to skip the constructor call on subsequent function calls. But then why is it skipped the first time the function runs as well?
EDIT: This code is part of a kernel I'm writing and I suspect the root cause might be an issue with my kernel's .bss section, here is the linker script I use:
OUTPUT_FORMAT("elf32-i386")
ENTRY(_start)
SECTIONS
{
. = 0x100000;
.text : AT(0x100000) {
*(.text)
}
.data : SUBALIGN(2) {
*(.data);
*(.rodata*);
}
.bss : SUBALIGN(4) {
__bss_start = .;
*(.COMMON);
*(.bss*)
. = ALIGN(4);
__bss_end = .;
}
/DISCARD/ : {
*(.eh_frame)
*(.comment)
}
}
And here's the code I use to zero the .bss section:
extern uint32_t __bss_start;
extern uint32_t __bss_end;
void zero_bss()
{
for (uint32_t bss_addr = __bss_start; bss_addr < __bss_end; ++bss_addr)
*reinterpret_cast<uint8_t *>(bss_addr) = 0x00;
}
But when zero_bss runs, __bss_start is 0x27 and __bss_end is 0x101, which is not at all what I'd expect (the BSS should encompass address 0x1002f8 after all).
I've solved it now; the hint from @user3124812 was what got me there, thanks again.
My zero_bss code was faulty, I needed to take the addresses of the __bss* markers from the linker script, i.e.:
extern uint8_t __bss_start;
extern uint8_t __bss_end;
void zero_bss()
{
uint8_t *bss_start = reinterpret_cast<uint8_t *>(&__bss_start);
uint8_t *bss_end = reinterpret_cast<uint8_t *>(&__bss_end);
for (uint8_t *bss_addr = bss_start; bss_addr < bss_end; ++bss_addr)
*bss_addr = 0x00;
}
Now everything works.
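To connect the fix to the original symptom: the instructions before the jump at 100231 load a guard byte (at 0x1002f8 in the listing) and skip the constructor when it is non-zero. That guard lives in .bss, so if .bss is not actually zeroed, the guard can hold garbage and the constructor is skipped even on the first call. Below is a rough sketch of that logic; Serial is a hypothetical stand-in for Cons_serial, and the function is an approximation of what the compiler emits with -fno-threadsafe-statics, not its exact output.

```cpp
#include <new>

// Hypothetical stand-in for Cons_serial that counts constructor runs.
struct Serial {
    static int constructed;
    Serial() { ++constructed; }
};
int Serial::constructed = 0;

// Approximation of the code emitted for `static Cons_serial serial;`:
// a guard byte decides whether the constructor still has to run.
// Returns true if the constructor ran during this call.
bool guarded_init(unsigned char& guard, void* storage) {
    if (guard != 0) {        // movzbl (guard) / test / jne: skip constructor
        return false;
    }
    new (storage) Serial();  // call to the constructor
    guard = 1;               // movb $0x1,(guard)
    return true;
}
```

With a zeroed guard the first call constructs the object and later calls skip it. With a guard that starts out as leftover garbage, every call skips it, which matches the behaviour observed before zero_bss was fixed.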
I've written some assembly instructions to a file and I would like to make the file executable. However, I'm getting something wrong with the program headers. I've read the whole man page about the ELF header, but I didn't understand much.
#include <elf.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
void MakeExecutable(char *codeBuffer, char *startAddr, unsigned int codeSize){
char *fileBuffer = (char*) malloc(2058);
memset(fileBuffer, 0, 2058);
//I've omitted the sections declaration
unsigned int codeOffset = sizeof(Elf64_Ehdr) + (unsigned int ) (startAddr - codeBuffer);
Elf64_Ehdr *h = (Elf64_Ehdr*) &fileBuffer[0];
h->e_ident[0] = 0x7f;
h->e_ident[1] = 0x45;
h->e_ident[2] = 0x4c;
h->e_ident[3] = 0x46;
h->e_ident[EI_CLASS] = ELFCLASS64;
h->e_ident[EI_DATA] = ELFDATA2LSB;
h->e_ident[EI_VERSION] = EV_CURRENT;
h->e_ident[EI_OSABI] = ELFOSABI_SYSV;
h->e_type = ET_DYN;
h->e_machine = EM_X86_64;
h->e_version = EV_CURRENT;
h->e_entry = 64;
h->e_phentsize = sizeof(Elf64_Phdr);
h->e_shentsize = sizeof(Elf64_Shdr);
h->e_shoff = 0;
h->e_phoff = 800;
h->e_shnum = 0;
h->e_phnum = 1;
h->e_shstrndx= 0;
h->e_ehsize = sizeof(Elf64_Ehdr);
Elf64_Phdr *ph = (Elf64_Phdr*) &fileBuffer[800];
ph->p_type = PT_LOAD;
ph->p_vaddr = 0x8050000;
ph->p_paddr = 0;
ph->p_offset = 0x4000;
ph->p_memsz = 2058;
ph->p_flags = PF_X | PF_R | PF_W;
ph->p_filesz = 2058;
ph->p_align = 0x100000;
int file = open("ex.out", O_TRUNC | O_RDWR, S_IRWXO | S_IRWXU | S_IRWXG );
int error = write(file, fileBuffer, 2058);
close(file);
}
When I execute ex.out, it gives a segmentation fault and a core dump. This is what the core dump looks like:
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
NOTE 0x0000000000000158 0x0000000000000000 0x0000000000000000
0x0000000000000aac 0x0000000000000000 0x0
LOAD 0x0000000000001000 0x00007f1a823b2000 0x0000000000000000
0x0000000000000000 0x0000000000001000 RWE 0x1000
LOAD 0x0000000000001000 0x00007fff4a1d0000 0x0000000000000000
0x0000000000021000 0x0000000000021000 RWE 0x1000
LOAD 0x0000000000022000 0x00007fff4a1f1000 0x0000000000000000
0x0000000000003000 0x0000000000003000 R 0x1000
LOAD 0x0000000000025000 0x00007fff4a1f4000 0x0000000000000000
0x0000000000002000 0x0000000000002000 R E 0x1000
Displaying notes found at file offset 0x00000158 with length 0x00000aac:
Owner Data size Description
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000088 NT_PRPSINFO (prpsinfo structure)
CORE 0x00000080 NT_SIGINFO (siginfo_t data)
CORE 0x00000140 NT_AUXV (auxiliary vector)
CORE 0x00000048 NT_FILE (mapped files)
Page size: 4096
Start End Page Offset
0x00007f1a823b2000 0x00007f1a823b3000 0x0000000000000004
/home/sudo_user/Compiler/ex.out
CORE 0x00000200 NT_FPREGSET (floating point registers)
LINUX 0x00000440 NT_X86_XSTATE (x86 XSAVE extended state)
description data: ffffffdf 66 f ...
It really did allocate a 4096 bytes page for me, so what could be the problem here? This is the objdump
ex.out: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <.text>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 48 81 ec 01 00 00 00 sub $0x1,%rsp
b: 8a 85 10 00 00 00 mov 0x10(%rbp),%al
11: 88 85 ff ff ff ff mov %al,-0x1(%rbp)
17: 49 c7 c1 00 00 00 00 mov $0x0,%r9
1e: 49 c7 c0 00 00 00 00 mov $0x0,%r8
25: 49 c7 c2 00 00 00 00 mov $0x0,%r10
2c: 48 c7 c2 00 00 00 00 mov $0x0,%rdx
33: 48 c7 c6 00 00 00 00 mov $0x0,%rsi
3a: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
41: 48 c7 c0 3c 00 00 00 mov $0x3c,%rax
48: 0f 05 syscall
4a: 48 89 ec mov %rbp,%rsp
4d: 5d pop %rbp
4e: c3 retq
Also, when I try to run it under gdb, it gives a segmentation fault as well:
"Program received signal SIGSEGV, Segmentation fault.
0x00007fffeffae040 in ??"
EDIT
Moreover, if I set p_vaddr to 0, no segmentation fault occurs; however, the code doesn't behave as expected. For example, if I put a print instruction in the code, nothing gets printed on the command line when the file is executed.
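As a way to sanity-check header values like the ones above, here is a sketch of three consistency rules for a loadable segment, taken from the ELF program header description: the segment's file bytes must lie inside the file, p_vaddr should equal p_offset modulo p_align, and the entry point should fall inside a loaded segment. LoadSegment is a hypothetical stand-in for the relevant Elf64_Phdr fields. Passing these checks does not guarantee that the loader accepts the file, and this is not a confirmed diagnosis of the crash.

```cpp
#include <cstdint>

// Hypothetical stand-in for the Elf64_Phdr fields of one PT_LOAD segment.
struct LoadSegment {
    std::uint64_t offset;  // p_offset
    std::uint64_t vaddr;   // p_vaddr
    std::uint64_t filesz;  // p_filesz
    std::uint64_t memsz;   // p_memsz
    std::uint64_t align;   // p_align
};

// The bytes [offset, offset + filesz) must exist in the file.
bool fits_in_file(const LoadSegment& s, std::uint64_t file_size) {
    return s.offset <= file_size && s.filesz <= file_size - s.offset;
}

// p_vaddr should equal p_offset modulo p_align (0 and 1 mean no alignment).
bool congruent(const LoadSegment& s) {
    return s.align <= 1 || (s.vaddr % s.align) == (s.offset % s.align);
}

// The entry point should land inside the memory image of the segment.
bool entry_inside(const LoadSegment& s, std::uint64_t entry) {
    return entry >= s.vaddr && entry - s.vaddr < s.memsz;
}
```

Fed with the values from the code above (p_offset 0x4000, p_vaddr 0x8050000, sizes 2058, p_align 0x100000, a 2058-byte file, e_entry 64), all three checks return false.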
I'm trying to port an mbed-os (RTX RTOS) project to the CC2538 (ARM Cortex-M3). It is compiled with the mbed-cli toolchain, which integrates arm-none-eabi-gcc. When I try to boot the MCU, I get stuck in a Hard Fault during startup.
00202678 <__libc_init_array>:
202678: b570 push {r4, r5, r6, lr}
20267a: 4e0f ldr r6, [pc, #60] ; (2026b8 <__libc_init_array+0x40>)
20267c: 4d0f ldr r5, [pc, #60] ; (2026bc <__libc_init_array+0x44>)
20267e: 1b76 subs r6, r6, r5
202680: 10b6 asrs r6, r6, #2
202682: bf18 it ne
202684: 2400 movne r4, #0
202686: d005 beq.n 202694 <__libc_init_array+0x1c>
202688: 3401 adds r4, #1
20268a: f855 3b04 ldr.w r3, [r5], #4
20268e: 4798 blx r3
202690: 42a6 cmp r6, r4
202692: d1f9 bne.n 202688 <__libc_init_array+0x10>
202694: 4e0a ldr r6, [pc, #40] ; (2026c0 <__libc_init_array+0x48>)
202696: 4d0b ldr r5, [pc, #44] ; (2026c4 <__libc_init_array+0x4c>)
202698: f004 fec2 bl 207420 <_etext>
20269c: 1b76 subs r6, r6, r5
20269e: 10b6 asrs r6, r6, #2
2026a0: bf18 it ne
2026a2: 2400 movne r4, #0
2026a4: d006 beq.n 2026b4 <__libc_init_array+0x3c>
2026a6: 3401 adds r4, #1
2026a8: f855 3b04 ldr.w r3, [r5], #4
2026ac: 4798 blx r3
2026ae: 42a6 cmp r6, r4
2026b0: d1f9 bne.n 2026a6 <__libc_init_array+0x2e>
2026b2: bd70 pop {r4, r5, r6, pc}
2026b4: bd70 pop {r4, r5, r6, pc}
2026b6: bf00 nop
I traced the code flow. The last instructions the CPU executes are
2026a4: d006 beq.n 2026b4 <__libc_init_array+0x3c>
then
2026b4: bd70 pop {r4, r5, r6, pc}
At this point PC gets the value 0, so the CPU jumps to address 0x00000000, which causes the Hard Fault.
Debugger dump after the CPU executes
202678: b570 push {r4, r5, r6, lr}
[register]
R0 =00000000
R1 =00000001
R2 =00000000
R3 =00000002
R4 =00000000
R5 =00000000
R6 =00000000
R7 =00000000
R8 =00000000
R9 =00000000
R10=00000000
R11=00000000
R12=00200F51
SP =200019F0
LR =00200A77
PC =0020267A
[memory]
200019b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
200019c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
200019d0: f0 09 00 20 00 00 00 00 00 00 00 00 04 0a 00 20
200019e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
200019f0: 00 00 00 00 00 00 00 00 00 00 00 00 77 0a 20 00
20001a00: 00 00 00 00 5d 0c 20 00 00 04 00 00 01 01 00 00
before cpu execute
2026b4: bd70 pop {r4, r5, r6, pc}
Debugger dump
[register]
R0 =00000000
R1 =00000001
R2 =00000000
R3 =00000002
R4 =00000000
R5 =00000000
R6 =00000000
R7 =00000000
R8 =00000000
R9 =00000000
R10=00000000
R11=00000000
R12=00200F51
SP =200019C0
LR =0020269D
PC =002026B4
[memory]
200019b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
200019c0: 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
200019d0: 00 00 00 00 9d 26 20 00 02 00 00 00 00 00 00 00
200019e0: 00 00 00 00 00 00 00 00 00 00 00 00 9d 26 20 00
200019f0: 00 00 00 00 00 00 00 00 00 00 00 00 77 0a 20 00
20001a00: 00 00 00 00 5d 0c 20 00 00 04 00 00 01 01 00 00
If I manually set the stack pointer to 0x200019f0 just before the pop instruction in __libc_init_array executes,
the code successfully reaches main() in the end,
so that seems to work around the problem.
My question is: why does the stack handling go wrong in __libc_init_array()?
I can't even find the source code of __libc_init_array() anywhere in the mbed-os project.
The .ld file is attached:
MEMORY
{
FLASH_FW (rx) : ORIGIN = 0x00200000 + 0,
LENGTH = (0x00200000 + (((((0) << 0 | (512) << 4 | (32) << 16 | ((1) ? 0x01000000 : 0) | ((1) ? 0x02000000 : 0)) & 0x0000FFF0) >> 4) << 10) - 0x0000002C) - (0x00200000 + 0)
FLASH_CCA (RX) : ORIGIN = (0x00200000 + (((((0) << 0 | (512) << 4 | (32) << 16 | ((1) ? 0x01000000 : 0) | ((1) ? 0x02000000 : 0)) & 0x0000FFF0) >> 4) << 10) - 0x0000002C), LENGTH = 0x0000002C
NRSRAM (RWX) : ORIGIN = 0x20000000, LENGTH = 0
FRSRAM (RWX) : ORIGIN = (((((((0) << 0 | (512) << 4 | (32) << 16 | ((1) ? 0x01000000 : 0) | ((1) ? 0x02000000 : 0)) & 0x00FF0000) >> 16) << 10) - ((((((((0) << 0 | (512) << 4 | (32) << 16 | ((1) ? 0x01000000 : 0) | ((1) ? 0x02000000 : 0)) & 0x00FF0000) >> 16) << 10)) < (16384)) ? ((((((0) << 0 | (512) << 4 | (32) << 16 | ((1) ? 0x01000000 : 0) | ((1) ? 0x02000000 : 0)) & 0x00FF0000) >> 16) << 10)) : (16384))) ? 0x20000000 : 0x20004000), LENGTH = (((((0) << 0 | (512) << 4 | (32) << 16 | ((1) ? 0x01000000 : 0) | ((1) ? 0x02000000 : 0)) & 0x00FF0000) >> 16) << 10)
}
/* Linker script to place sections and symbol values. Should be used together
* with other linker script that defines memory regions FLASH and RAM.
* It references following symbols, which must be defined in code:
* Reset_Handler : Entry of reset handler
*
* It defines following symbols, which code can use without definition:
* __exidx_start
* __exidx_end
* __etext
* __data_start__
* __preinit_array_start
* __preinit_array_end
* __init_array_start
* __init_array_end
* __fini_array_start
* __fini_array_end
* __data_end__
* __bss_start__
* __bss_end__
* __end__
* end
* __HeapLimit
* __StackLimit
* __StackTop
* __stack
*/
ENTRY(flash_cca_lock_page)
SECTIONS
{
.text :
{
_text = .;
*(.vectors)
*(.text*)
*(.rodata*)
_etext = .;
} > FLASH_FW= 0
.socdata (NOLOAD) :
{
*(.udma_channel_control_table)
} > FRSRAM
.data : ALIGN(4)
{
_data = .;
*(.data*)
_edata = .;
} > FRSRAM AT > FLASH_FW
_ldata = LOADADDR(.data);
.ARM.exidx :
{
*(.ARM.exidx*)
} > FLASH_FW
.bss :
{
_bss = .;
*(.bss*)
*(COMMON)
_ebss = .;
} > FRSRAM
.heap :
{
__end__ = .;
end = __end__;
*(.heap*)
__HeapLimit = .;
} > RAM
.stack (NOLOAD) :
{
*(.stack)
} > FRSRAM
_heap = .;
_eheap = ORIGIN(FRSRAM) + LENGTH(FRSRAM);
.nrdata (NOLOAD) :
{
_nrdata = .;
*(.nrdata*)
_enrdata = .;
} > NRSRAM
.flashcca :
{
*(.flashcca)
} > FLASH_CCA
}
@notlikethat is right: the problem is the linker script file, but the root cause is not overlapping sections. As I posted above, in __libc_init_array()
202678: b570 push {r4, r5, r6, lr}
At this point the stack pointer is 0x200019F0, but somehow, when the pop executes, the stack pointer is 0x200019C0, which causes the Hard Fault. I traced the code flow: __libc_init_array() jumps to the <_init> section at
202698: f004 fec2 bl 207420 <_etext>
which looks like this in memory:
00207420 <_init>:
207420: b5f8 push {r3, r4, r5, r6, r7, lr}
207422: bf00 nop
00207424 <_fini>:
207426: b5f8 push {r3, r4, r5, r6, r7, lr}
207428: bf00 nop
I suspected that this function makes the stack pointer run too far down, because the <_init> section contains only a push instruction and no pop.
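That suspicion can be checked against the debugger dumps with a little arithmetic. The sketch below assumes that execution ran through both truncated stubs (<_init> and then <_fini>) and returned to the caller without popping anything; the dumps do not show how control got back, so that part is an assumption. The constant names are made up for illustration.

```cpp
#include <cstdint>

constexpr std::uint32_t kWord = 4;

// A Thumb push of `regs` registers lowers SP by 4 bytes per register.
constexpr std::uint32_t after_push(std::uint32_t sp, unsigned regs) {
    return sp - kWord * regs;
}

// SP on entry to __libc_init_array (0x200019F0 after its 4-register push).
constexpr std::uint32_t kSpEntry     = 0x20001A00u;
constexpr std::uint32_t kSpOwnPush   = after_push(kSpEntry, 4);    // {r4, r5, r6, lr}
constexpr std::uint32_t kSpAfterInit = after_push(kSpOwnPush, 6);  // _init: {r3-r7, lr}
constexpr std::uint32_t kSpAfterFini = after_push(kSpAfterInit, 6);// _fini: same push

static_assert(kSpOwnPush == 0x200019F0u, "matches the first register dump");
static_assert(kSpAfterFini == 0x200019C0u, "matches SP before the final pop");

// pop {r4, r5, r6, pc} loads pc from the fourth word at and above SP.
constexpr std::uint32_t kPcSlot = kSpAfterFini + 3 * kWord;
static_assert(kPcSlot == 0x200019CCu, "this word is 0 in the memory dump");
```

Two unmatched 24-byte pushes account exactly for the 0x30 bytes by which SP is off, and the word at 0x200019CC in the second memory dump is 0, which is the value that ends up in PC.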
After more web searching about the <_init> section, I confirmed that it should not consist of only two instructions, and that its contents are affected by the linker file.
In the previous linker file, I did not handle the .init and .fini sections.
I then modified it so that it looks like this:
.text :
{
_text = .;
*(.vectors)
*(.text*)
KEEP(*(.init))
KEEP(*(.fini))
/* .ctors */
*crtbegin.o(.ctors)
*crtbegin?.o(.ctors)
*(EXCLUDE_FILE(*crtend?.o *crtend.o) .ctors)
*(SORT(.ctors.*))
*(.ctors)
/* .dtors */
*crtbegin.o(.dtors)
*crtbegin?.o(.dtors)
*(EXCLUDE_FILE(*crtend?.o *crtend.o) .dtors)
*(SORT(.dtors.*))
*(.dtors)
*(.rodata*)
KEEP(*(.eh_frame*))
_etext = .;
} > FLASH_FW= 0
.socdata (NOLOAD) :
After recompiling, the <_init> and <_fini> sections changed as follows:
00207040 <_init>:
207040: b5f8 push {r3, r4, r5, r6, r7, lr}
207042: bf00 nop
207044: bcf8 pop {r3, r4, r5, r6, r7}
207046: bc08 pop {r3}
207048: 469e mov lr, r3
20704a: 4770 bx lr
0020704c <_fini>:
20704c: b5f8 push {r3, r4, r5, r6, r7, lr}
20704e: bf00 nop
207050: bcf8 pop {r3, r4, r5, r6, r7}
207052: bc08 pop {r3}
207054: 469e mov lr, r3
207056: 4770 bx lr
207058: 6465626d .word 0x6465626d
20705c: 73736120 .word 0x73736120
207060: 61747265 .word 0x61747265
207064: 6e6f6974 .word 0x6e6f6974
207068: 69616620 .word 0x69616620
20706c: 3a64656c .word 0x3a64656c
207070: 2c732520 .word 0x2c732520
207074: 6c696620 .word 0x6c696620
207078: 25203a65 .word 0x25203a65
and the program then jumps into main() successfully.