I have a linker script like this:
OUTPUT_FORMAT(binary)
SECTIONS
{
. = 0xFFFF800000000000 ;
.startup_text : { processor.o(.text) }
.text : { *(EXCLUDE_FILE (processor.o) .text) }
.data : { *(.data) }
.rodata : { *(.rodata) }
linker_first_free_page = ALIGN(4096);
}
A piece of code loads the executable generated by this script and prints this information:
size of executable (pages) 3
first free page 0xffff800000003000
And the executable itself prints:
&(linker_first_free_page) 0xffff800000003000
So everything works well so far. Now my executable needs a .bss section. Note that I don't have a loader capable of loading ELF files, so I need a flat binary that can simply be read into memory and used, with all sections inside.
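For reference, the two loader-side numbers above can be reproduced with plain page arithmetic. This is only a minimal sketch, assuming a 4096-byte page (taken from ALIGN(4096)) and the base address from the script; pages_for and first_free_page are hypothetical names and not part of the actual loader.

```cpp
#include <cassert>
#include <cstdint>

// Page size implied by ALIGN(4096) in the linker script.
const std::uint64_t kPageSize = 4096;
// Link address from the script: ". = 0xFFFF800000000000".
const std::uint64_t kLoadBase = 0xFFFF800000000000ULL;

// Number of whole pages needed to hold `bytes` bytes (rounds up).
std::uint64_t pages_for(std::uint64_t bytes) {
    return (bytes + kPageSize - 1) / kPageSize;
}

// First page-aligned address after an image of `bytes` bytes loaded at `base`.
std::uint64_t first_free_page(std::uint64_t base, std::uint64_t bytes) {
    assert(base % kPageSize == 0);  // the load address is assumed page-aligned
    return base + pages_for(bytes) * kPageSize;
}
```

Any file size from 8193 to 12288 bytes gives 3 pages and a first free page of 0xffff800000003000, which matches the output above. The loader only sees the file size, so anything the linker reserves in memory without writing it to the file is invisible to this calculation.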
First attempt
OUTPUT_FORMAT(binary)
SECTIONS
{
. = 0xFFFF800000000000 ;
.startup_text : { processor.o(.text) }
.text : { *(EXCLUDE_FILE (processor.o) .text) }
.data : { *(.data) }
.rodata : { *(.rodata) }
.bss : { *(.bss) }
linker_first_free_page = ALIGN(4096);
}
That is, I simply added a .bss section. This is the output:
size of executable (pages) 3
first free page 0xffff800000003000
&(linker_first_free_page) 0xffff800000004000
That is, the linker variable is correctly updated, but the section takes up no space in the binary (I guess this is pretty normal for a .bss section).
Second attempt
OUTPUT_FORMAT(binary)
SECTIONS
{
. = 0xFFFF800000000000 ;
.startup_text : { processor.o(.text) }
.text : { *(EXCLUDE_FILE (processor.o) .text) }
.data : { *(.data) *(.bss) }
.rodata : { *(.rodata) }
linker_first_free_page = ALIGN(4096);
}
That is, I put the .bss section inside the .data one. This is the output:
size of executable (pages) 4
first free page 0xffff800000004000
&(linker_first_free_page) 0xffff800000003000
That is, the .bss is allocated, but the linker variable is not updated (and I can't understand why...)
Short question
So, given all of the above, how can I embed the .bss section in a flat binary, so that it can be loaded into memory like a "standard" file and used directly, without a specific loader?
Related
I'm working on the Upsilon Project (https://github.com/UpsilonNumworks/Upsilon) and I'm having trouble implementing SRAM data retention after a reset (using the NRST pin).
I want to preserve the data in staticStorageArea, so I did this:
extern "C" {
extern char _bss_records_section_start;
}
static volatile uint32_t * staticStorageArea[sizeof(Storage)/sizeof(uint32_t)] __attribute__((section(".noinit"))) ;
And here is my linker script:
MEMORY {
INTERNAL_FLASH (rx) : ORIGIN = 0x00200000, LENGTH = 64K
SRAM (rw) : ORIGIN = 0x20000000, LENGTH = 256K - 0xfa20
NOINIT (rwx) : ORIGIN = 0x20000000 + 256K - 0xfa20, LENGTH = 0xfa20
EXTERNAL_FLASH (rx) : ORIGIN = 0x90000000, LENGTH = 8M
}
STACK_SIZE = 32K;
FIRST_EXTERNAL_FLASH_SECTOR_SIZE = 4K;
SECTIONS {
.isr_vector_table ORIGIN(INTERNAL_FLASH) : {
KEEP(*(.isr_vector_table))
} >INTERNAL_FLASH
.header : {
KEEP(*(.header))
} >INTERNAL_FLASH
.text.internal_to_external : {
...
} >INTERNAL_FLASH
...
} >INTERNAL_FLASH
.rodata.internal : {
...
} >INTERNAL_FLASH
.exam_mode_buffer ORIGIN(EXTERNAL_FLASH) : {
...
} >EXTERNAL_FLASH
/* External flash memory */
.text.external : {
. = ALIGN(4);
*(.text)
*(.text.*)
} >EXTERNAL_FLASH
.rodata.external : {
*(.rodata)
*(.rodata.*)
} >EXTERNAL_FLASH
.init_array : {
...
} >INTERNAL_FLASH
.data : {
...
} >SRAM AT> INTERNAL_FLASH
.bss : {
...
} >SRAM
.heap : {
...
} >SRAM
.stack : {
...
} >SRAM
.noinit (NOLOAD) : ALIGN(4) {
_bss_records_section_start = .;
KEEP(*(.noinit))
KEEP(*(.noinit*))
_bss_records_section_end = .;
} >NOINIT
/DISCARD/ : {
...
}
}
NOCROSSREFS_TO(.text.external .text.internal);
NOCROSSREFS_TO(.rodata.external .text.internal);
NOCROSSREFS_TO(.text.external .rodata.internal);
NOCROSSREFS_TO(.rodata.external .rodata.internal);
NOCROSSREFS_TO(.text.external .isr_vector_table);
NOCROSSREFS_TO(.rodata.external .isr_vector_table);
NOCROSSREFS_TO(.text.external .header);
NOCROSSREFS_TO(.rodata.external .header);
NOCROSSREFS_TO(.exam_mode_buffer .text.internal);
NOCROSSREFS_TO(.exam_mode_buffer .rodata.internal);
NOCROSSREFS_TO(.exam_mode_buffer .isr_vector_table);
NOCROSSREFS_TO(.exam_mode_buffer .header);
I noticed that after every reset the data in the .noinit section looks corrupted, and I do not understand why. Maybe I misunderstood what NRST actually does? (The website says that the power is "virtually disconnected".)
I found startup code for ARM Cortex-M cores on the internet and I am using those sources, but I have some doubts about one function in them. Below are the code and the linker script being used.
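One common way to tell whether retained data really survived a reset is to guard it with a marker word and a checksum, and to reinitialize it only when the check fails. The sketch below only illustrates that idea and runs on a host; RetainedBlock, kMagic and the function names are hypothetical and not taken from Upsilon, and on the target the block would be placed in .noinit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout of a retained block; on target it would live in .noinit.
struct RetainedBlock {
    std::uint32_t magic;       // marks the block as initialized
    std::uint32_t checksum;    // additive checksum over the payload
    std::uint32_t payload[8];  // stand-in for the real storage area
};

const std::uint32_t kMagic = 0x5AFEDA7Au;  // arbitrary marker value

// Sum of all payload words (wraps around on overflow).
std::uint32_t checksum_of(const RetainedBlock& b) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < sizeof b.payload / sizeof b.payload[0]; ++i) {
        sum += b.payload[i];
    }
    return sum;
}

// Record marker and checksum after the payload has been modified.
void seal(RetainedBlock& b) {
    b.magic = kMagic;
    b.checksum = checksum_of(b);
}

bool is_valid(const RetainedBlock& b) {
    return b.magic == kMagic && b.checksum == checksum_of(b);
}

// Called early after reset: keep the data if it survived, otherwise start
// clean. Returns true when the previous contents were kept.
bool restore_or_reset(RetainedBlock& b) {
    if (is_valid(b)) {
        return true;
    }
    for (auto& word : b.payload) {
        word = 0;
    }
    seal(b);
    assert(is_valid(b));
    return false;
}
```

A check like this does not explain the corruption, but it separates "the RAM lost its contents" from "something in the startup path overwrote the region", which narrows down where to look.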
// very simple startup code with definition of handlers for all cortex-m cores
// location of these variables is defined in linker script
extern unsigned __data_load;
extern unsigned __data_start;
extern unsigned __data_end;
extern unsigned __bss_start;
extern unsigned __bss_end;
extern unsigned __heap_start;
extern unsigned __init_array_start;
extern unsigned __init_array_end;
extern unsigned __fini_array_start;
extern unsigned __fini_array_end;
// main application
extern void main_app();
void copy_data() {
unsigned *src = &__data_load;
unsigned *dst = &__data_start;
while (dst < &__data_end) {
*dst++ = *src++;
}
}
void zero_bss() {
unsigned *dst = &__bss_start;
while (dst < &__bss_end) {
*dst++ = 0;
}
}
void fill_heap(unsigned fill=0x55555555) {
unsigned *dst = &__heap_start;
register unsigned *msp_reg;
__asm__("mrs %0, msp\n" : "=r" (msp_reg) );
while (dst < msp_reg) {
*dst++ = fill;
}
}
void call_init_array() {
unsigned *tbl = &__init_array_start;
while (tbl < &__init_array_end) {
((void (*)())*tbl++)();
}
}
void call_fini_array() {
unsigned *tbl = &__fini_array_start;
while (tbl < &__fini_array_end) {
((void (*)())*tbl++)();
}
}
// reset handler
void RESET_handler() {
copy_data();
zero_bss();
fill_heap();
call_init_array();
// run application
main_app();
// call destructors for static instances
call_fini_array();
// stop
while (true);
}
The following linker script is being used:
SECTIONS {
. = ORIGIN(FLASH);
.text : {
KEEP(*(.stack))
KEEP(*(.vectors))
KEEP(*(.vectors*))
KEEP(*(.text))
. = ALIGN(4);
*(.text*)
. = ALIGN(4);
KEEP(*(.rodata))
*(.rodata*)
. = ALIGN(4);
} >FLASH
.init_array ALIGN(4): {
__init_array_start = .;
KEEP(*(.init_array))
__init_array_end = .;
} >FLASH
.fini_array ALIGN(4): {
__fini_array_start = .;
KEEP(*(.fini_array))
__fini_array_end = .;
} >FLASH
}
SECTIONS {
__stacktop = ORIGIN(SRAM) + LENGTH(SRAM);
__data_load = LOADADDR(.data);
. = ORIGIN(SRAM);
.data ALIGN(4) : {
__data_start = .;
*(.data)
*(.data*)
. = ALIGN(4);
__data_end = .;
} >SRAM AT >FLASH
.bss ALIGN(4) (NOLOAD) : {
__bss_start = .;
*(.bss)
*(.bss*)
. = ALIGN(4);
__bss_end = .;
*(.noinit)
*(.noinit*)
} >SRAM
. = ALIGN(4);
__heap_start = .;
}
My question is about the copy_data() function: why do we need to assign the address of __data_load to the pointer src? Is __data_load = LOADADDR(.data); the same as __data_start? What does the copy_data() function do in the program? Thanks in advance.
The linker script instructs the linker to place the data in flash but to link the code as if the data were in RAM. The startup code then copies the data from the address it is loaded at (flash) to the address where it is supposed to be (RAM).
copy_data() copies memory, reading from the start address __data_load, into the address range from __data_start to __data_end.
The total size copied is therefore __data_end - __data_start.
Of course you already have the data available at __data_load. The program copies it from FLASH to SRAM where it can be read and written as much as needed.
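The copy described above can be tried on a host by letting ordinary arrays stand in for flash and SRAM. This is a minimal sketch under that assumption; copy_words and zero_words are hypothetical names that mirror copy_data() and zero_bss() from the question, with the linker symbols replaced by pointer parameters.

```cpp
#include <cassert>
#include <cstdint>

// Copy the initial values of .data from their load address to their run
// address. `load` plays the role of &__data_load, and [run_begin, run_end)
// plays the role of [&__data_start, &__data_end).
void copy_words(const std::uint32_t* load,
                std::uint32_t* run_begin,
                std::uint32_t* run_end) {
    assert(run_begin <= run_end);
    for (std::uint32_t* dst = run_begin; dst < run_end; ++dst) {
        *dst = *load++;
    }
}

// Zero the range [begin, end), as zero_bss() does for .bss.
void zero_words(std::uint32_t* begin, std::uint32_t* end) {
    assert(begin <= end);
    for (std::uint32_t* dst = begin; dst < end; ++dst) {
        *dst = 0;
    }
}
```

Note that the length of the copy comes only from the run-address range; the load address contributes just the starting point of the source.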
The problem
We have an issue in embedded code. Storage is either non-volatile but unmodifiable (flash, ROM, etc.) or volatile and modifiable (RAM). 'data' holds variables that are initialized to arbitrary non-zero values and can be modified later (unlike const or rodata). How can this be arranged?
A copy of the data is put in flash or ROM. This data is then copied to RAM where it is both read and written.
My question is about the copy_data() function: why do we need to assign the address of __data_load to the pointer src? Is __data_load = LOADADDR(.data); the same as __data_start? What does the copy_data() function do in the program?
copy_data() is the solution to the above problem. It takes memory from the flash (load location) and copies it to RAM, so __data_load (the load address in flash) is not the same as __data_start (the run address in RAM). A similar dichotomy can exist with virtual addressing, where you need to arrange for the contents at a physical address and at a virtual address to be the same before enabling an MMU. The GNU ld documentation calls the run (RAM) location the VMA (virtual memory address) and the load location the LMA (load memory address).
With an OS or some ROM bootloaders you may load from disk/MMC (NAND flash) to RAM and be able to circumvent copy_data(). It is only needed if your code will run directly from a non-volatile device. It can often be faster and simpler just to copy the entire image from flash to RAM. This depends on resources of course. Read access from RAM is often faster than flash. Again this will depend on your system.
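The case where a bootloader has already placed the image in RAM can be folded into the same startup routine by comparing the load and run addresses first. This is a hedged sketch rather than anything from the sources in the question; relocate_if_needed is a hypothetical name, and the parameters stand in for &__data_load, &__data_start and &__data_end.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copy .data to its run address only when it was loaded somewhere else.
// Returns the number of words copied (0 when load and run addresses match).
std::size_t relocate_if_needed(const std::uint32_t* load,
                               std::uint32_t* run_begin,
                               std::uint32_t* run_end) {
    assert(run_begin <= run_end);
    if (load == run_begin) {
        return 0;  // the image already sits at its run address
    }
    const std::size_t words = static_cast<std::size_t>(run_end - run_begin);
    // memmove tolerates overlapping source and destination ranges.
    std::memmove(run_begin, load, words * sizeof(std::uint32_t));
    return words;
}
```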
I have written startup code and a linker script for my C++ application, running on an STM32F407VG.
The problem is that I have an array of structures in which the field str is always zero despite the initialization. The other fields in the struct are correctly initialized. I can't understand what I'm doing wrong; I guess some part of the initialization in the startup code is missing.
The array is declared as follows:
struct elem{
uint32_t str;
uint32_t value;
uint32_t value2;
};
const struct elem array[]{
{(uint32_t)(*(uint32_t*)"CM1"), 1, 1},
{(uint32_t)(*(uint32_t*)"CM2"), 2, 2},
{(uint32_t)(*(uint32_t*)"CM3"), 3, 3}
};
Relevant section of the startup code:
inline void static_init()
{
for (void (**p)() = __preinit_array_start; p < __preinit_array_end; ++p)
(*p)();
for (void (**p)() = __init_array_start; p < __init_array_end; ++p)
(*p)();
}
void reset_handler(void)
{
unsigned long *source;
unsigned long *destination;
// Copying data from Flash to RAM
source = &_data_flash;
for (destination = &_data_begin; destination < &_data_end;)
{
*(destination++) = *(source++);
}
// default zero to undefined variables
for (destination = &_bss_begin; destination < &_bss_end;)
{
*(destination++) = 0;
}
static_init();
#ifndef __NO_SYSTEM_INIT
SystemInit();
#endif
// starting main program
main();
}
and the linker script:
/* Entry Point */
ENTRY(reset_handler)
_estack = 0x20010000; /* end of 128K RAM */
/* Specify the memory areas */
/*
0x08000000 until 0x08010000 is reserved for BOOTLOADER! (64k)
*/
MEMORY
{
EEPROM (rwx) : ORIGIN = 0x08010000, LENGTH = 64K /*fake EEPROM!*/
FLASH (rx) : ORIGIN = 0x08020000, LENGTH = 896K
RAM (xrw) : ORIGIN = 0x20000000, LENGTH = 128K
RAM2 (rw) : ORIGIN = 0x10000000, LENGTH = 64K
}
SECTIONS
{
/* The startup code goes first into FLASH */
.isr_vector :
{
. = ALIGN(4);
__intvec_start__ = .;
KEEP(*(.isr_vector)) /* Startup code */
. = ALIGN(4);
} >FLASH
/* The program code and other data goes into FLASH */
.text :
{
. = ALIGN(4);
_text = .;
*(.text) /* .text sections (code) */
_text2 = .;
*(.text*) /* .text* sections (code) */
_rodata = .;
*(.rodata) /* .rodata sections (constants, strings, etc.) */
*(.rodata*) /* .rodata* sections (constants, strings, etc.) */
*(.glue_7) /* glue arm to thumb code */
*(.glue_7t) /* glue thumb to arm code */
*(.eh_frame)
_init_data = .;
KEEP (*(.init))
KEEP (*(.fini))
. = ALIGN(4);
_etext = .; /* define a global symbols at end of code */
} > FLASH
.ARM.extab : { *(.ARM.extab* .gnu.linkonce.armextab.*) } >FLASH
.ARM : {
__exidx_start = .;
*(.ARM.exidx*)
__exidx_end = .;
} >FLASH
.preinit_array :
{
PROVIDE_HIDDEN (__preinit_array_start = .);
KEEP (*(.preinit_array*))
PROVIDE_HIDDEN (__preinit_array_end = .);
} >FLASH
.init_array :
{
PROVIDE_HIDDEN (__init_array_start = .);
KEEP (*(SORT(.init_array.*)))
KEEP (*(.init_array*))
PROVIDE_HIDDEN (__init_array_end = .);
} >FLASH
.fini_array :
{
PROVIDE_HIDDEN (__fini_array_start = .);
KEEP (*(SORT(.fini_array.*)))
KEEP (*(.fini_array*))
PROVIDE_HIDDEN (__fini_array_end = .);
} >FLASH
/* used by the startup to initialize data */
_sidata = LOADADDR(.data);
/* used by the startup to initialize data */
_data_flash = _sidata;
/* Initialized data sections goes into RAM, load LMA copy after code */
.data :
{
. = ALIGN(4);
_data_begin = .;
*(.data)
*(.data*)
. = ALIGN(4);
_data_end = .;
} >RAM AT> FLASH
.bss (NOLOAD) :
{
. = ALIGN(4);
_bss_begin = .;
__bss_start__ = _bss_begin;
*(.bss)
*(.bss*)
*(COMMON)
. = ALIGN(4);
_bss_end = .;
__bss_end__ = _bss_end;
} > RAM
stack_size = 1024;
__stack_end__ = ORIGIN(RAM)+LENGTH(RAM);
__stack_start__ = __stack_end__ - stack_size;
heap_size = 0;
__heap_end__ = __stack_start__;
__heap_start__ = __heap_end__ - heap_size;
. = __stack_start__;
._stack :
{
PROVIDE ( end = . );
. = . + stack_size;
. = . + heap_size;
. = ALIGN(4);
} > RAM
_siccmram = LOADADDR(.ram2);
.ram2 (NOLOAD) :
{
. = ALIGN(4);
*(.ram2);
*(.ram2*);
. = ALIGN(4);
} > RAM2 AT> FLASH
/* Remove information from the standard libraries */
/DISCARD/ :
{
libc.a ( * )
libm.a ( * )
libgcc.a ( * )
}
.ARM.attributes 0 : { *(.ARM.attributes) }
}
This should work and have no UB.
However, it's endian dependent.
#include <iostream>
#include <cstdint>
#include <cstring>
using namespace std;
struct elem {
uint32_t str;
uint32_t value;
uint32_t value2;
};
uint32_t makeint(const char str[4])
{
uint32_t val;
memcpy( &val, str, 4 );
return val;
}
const elem arr[] = {
{makeint("CM1"), 1, 1},
{makeint("CM2"), 2, 2},
{makeint("CM3"), 3, 3}
};
int main()
{
for (auto& e : arr)
cout << e.str << endl;
cout << "\ndone\n";
}
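If the byte-order dependence is a concern, the same value can be built with shifts instead of memcpy. The sketch below is an assumption-labelled alternative, not part of the answer above; pack_tag is a hypothetical name, and it places the first character in the least significant byte, which is what the memcpy version produces on a little-endian target.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Pack up to four characters into a uint32_t with the first character in the
// least significant byte, independent of the host byte order. Shorter strings
// leave the upper bytes zero; characters after the fourth are ignored.
std::uint32_t pack_tag(const char* s) {
    assert(s != nullptr);
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < 4 && s[i] != '\0'; ++i) {
        const std::uint32_t byte = static_cast<unsigned char>(s[i]);
        value |= byte << (8 * i);
    }
    return value;
}
```

With an ASCII execution character set, pack_tag("CM1") is 0x00314D43 on every host.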
You might use a multicharacter literal: see item (6) of character_literal.
Note the single quotes:
const struct elem array[]{
{'CM1', 1, 1},
{'CM2', 2, 2},
{'CM3', 3, 3}
};
You can see how GCC evaluates a multicharacter literal:
https://gcc.gnu.org/onlinedocs/cpp/Implementation-defined-behavior.html#Implementation-defined-behavior
The compiler evaluates a multi-character character constant a character at a time, shifting the previous value left by the number of bits per target character, and then or-ing in the bit-pattern of the new character truncated to the width of a target character. The final bit-pattern is given type int, and is therefore signed, regardless of whether single characters are signed or not. If there are more characters in the constant than would fit in the target int the compiler issues a warning, and the excess leading characters are ignored.
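The rule quoted above can be written out as a small function. This is only a sketch of the described evaluation, assuming 8-bit target characters and a 32-bit int; multichar_value is a hypothetical helper, and it returns the unsigned bit pattern rather than the signed int that GCC gives the literal.

```cpp
#include <cassert>
#include <cstdint>

// Emulate the quoted rule: process one character at a time, shift the
// previous value left by 8 bits and OR in the new character. Excess leading
// characters fall off the top of the 32-bit value.
std::uint32_t multichar_value(const char* s) {
    assert(s != nullptr);
    std::uint32_t value = 0;
    for (; *s != '\0'; ++s) {
        value = (value << 8) | static_cast<unsigned char>(*s);
    }
    return value;
}
```

With ASCII, multichar_value("CM1") is 0x434D31. That puts 'C' in the most significant occupied byte, so it is not the same number that copying the bytes of "CM1" into a uint32_t produces on a little-endian target (0x00314D43).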
I'm trying to debug a program on LH75401 device with ARM7TDMI core using GDB. When I invoke "load" command, GDB loads only ".text" output section. How do I make it so that it loads not only the ".text" section but some other sections too? I tried to use PHDRS command in the linker script to make some sections loadable but it did not help. Here is my linker script:
USR_STACK_SIZE = 0x100;
IRQ_STACK_SIZE = 0x100;
MEMORY
{
EXTROM(wx) : ORIGIN = 0x44000000, LENGTH = 0x100000
EXTRAM(wx) : ORIGIN = 0x48000000, LENGTH = 0x100000
INTRAM(wx) : ORIGIN = 0x60000000, LENGTH = 0x4000
TCMRAM(wx) : ORIGIN = 0x80000000, LENGTH = 0x4000
}
PHDRS
{
EXTROM PT_LOAD;
EXTRAM PT_LOAD;
INTRAM PT_LOAD;
TCMRAM PT_LOAD;
}
ENTRY(rst_handler)
SECTIONS
{
.vect :
{
*(.vect)
}
> TCMRAM AT
> TCMRAM
: TCMRAM
.text :
{
*(.text)
*(.rodata)
}
> EXTROM AT
> EXTROM
: EXTROM
.data :
{
*(.data)
}
> EXTRAM AT
> EXTRAM
: EXTRAM
.bss :
{
*(.bss)
}
> EXTRAM AT
> EXTRAM
: EXTRAM
.usr_stack :
{
. += USR_STACK_SIZE;
. = ALIGN(8);
}
> EXTRAM AT
> EXTRAM
: EXTRAM
.irq_stack :
{
. += IRQ_STACK_SIZE;
. = ALIGN(8);
}
> EXTRAM AT
> EXTRAM
: EXTRAM
}
Fortunately I solved this problem. It was necessary to add a special section attribute in the source file. The attribute is "a", meaning that the section is allocatable. That's all.
I would like to run my C++ code on ChibiOS. I can compile and link the code if I replace new and delete with the C functions malloc and free, but I would still like to fix the issue. When I use new and delete I receive the following error:
no memory region specified for loadable section `.ARM.extab'
This is the linker script:
/*
* BCM2835 memory setup.
*/
__und_stack_size__ = 0x0004;
__abt_stack_size__ = 0x0004;
__fiq_stack_size__ = 0x0010;
__irq_stack_size__ = 0x0080;
__svc_stack_size__ = 0x0004;
__sys_stack_size__ = 0x0400;
__stacks_total_size__ = __und_stack_size__ + __abt_stack_size__ + __fiq_stack_size__ + __irq_stack_size__ + __svc_stack_size__ + __sys_stack_size__;
MEMORY
{
ram : org = 0x8000, len = 0x06000000 - 0x10021
extabram : org = 0x06008000 - 0x10020, len = 0x10000 - 0x20
}
__ram_start__ = ORIGIN(ram);
__ram_size__ = LENGTH(ram);
__ram_end__ = __ram_start__ + __ram_size__;
SECTIONS
{
. = 0;
.text : ALIGN(16) SUBALIGN(16)
{
_text = .;
KEEP(*(vectors))
*(.text)
*(.text.*)
*(.rodata)
*(.rodata.*)
*(.glue_7t)
*(.glue_7)
*(.gcc*)
*(.ctors)
*(.dtors)
} > ram
.ARM.extab : {*(.ARM.extab* .gnu.linkonce.armextab.*)} > extabram
__exidx_start = .;
.ARM.exidx : {*(.ARM.exidx* .gnu.linkonce.armexidx.*)} > ram
__exidx_end = .;
.eh_frame_hdr : {*(.eh_frame_hdr)}
.eh_frame : ONLY_IF_RO {*(.eh_frame)}
. = ALIGN(4);
_etext = .;
_textdata = _etext;
.data :
{
_data = .;
*(.data)
. = ALIGN(4);
*(.data.*)
. = ALIGN(4);
*(.ramtext)
. = ALIGN(4);
_edata = .;
} > ram
.bss :
{
_bss_start = .;
*(.bss)
. = ALIGN(4);
*(.bss.*)
. = ALIGN(4);
*(COMMON)
. = ALIGN(4);
_bss_end = .;
} > ram
}
PROVIDE(end = .);
_end = .;
__heap_base__ = _end;
__heap_end__ = __ram_end__ - __stacks_total_size__;
__main_thread_stack_base__ = __ram_end__ - __stacks_total_size__;
What is missing in the linker script?
EDITED
After I followed Chris Desjardins answer, I'm receiving the following errors:
Linking build/ch.elf
/home/robu/CodeSourcery/Sourcery_CodeBench_Lite_for_ARM_EABI/bin/../lib/gcc/arm-none-eabi/4.7.2/../../../../arm-none-eabi/lib/libc.a(lib_a-abort.o): In function `abort':
abort.c:(.text+0x10): undefined reference to `_exit'
/home/robu/CodeSourcery/Sourcery_CodeBench_Lite_for_ARM_EABI/bin/../lib/gcc/arm-none-eabi/4.7.2/../../../../arm-none-eabi/lib/libc.a(lib_a-fstatr.o): In function `_fstat_r':
fstatr.c:(.text+0x1c): undefined reference to `_fstat'
/home/robu/CodeSourcery/Sourcery_CodeBench_Lite_for_ARM_EABI/bin/../lib/gcc/arm-none-eabi/4.7.2/../../../../arm-none-eabi/lib/libc.a(lib_a-sbrkr.o): In function `_sbrk_r':
sbrkr.c:(.text+0x18): undefined reference to `_sbrk'
/home/robu/CodeSourcery/Sourcery_CodeBench_Lite_for_ARM_EABI/bin/../lib/gcc/arm-none-eabi/4.7.2/../../../../arm-none-eabi/lib/libc.a(lib_a-signalr.o): In function `_kill_r':
signalr.c:(.text+0x1c): undefined reference to `_kill'
/home/robu/CodeSourcery/Sourcery_CodeBench_Lite_for_ARM_EABI/bin/../lib/gcc/arm-none-eabi/4.7.2/../../../../arm-none-eabi/lib/libc.a(lib_a-signalr.o): In function `_getpid_r':
signalr.c:(.text+0x44): undefined reference to `_getpid'
/home/robu/CodeSourcery/Sourcery_CodeBench_Lite_for_ARM_EABI/bin/../lib/gcc/arm-none-eabi/4.7.2/../../../../arm-none-eabi/lib/libc.a(lib_a-writer.o): In function `_write_r':
writer.c:(.text+0x20): undefined reference to `_write'
/home/robu/CodeSourcery/Sourcery_CodeBench_Lite_for_ARM_EABI/bin/../lib/gcc/arm-none-eabi/4.7.2/../../../../arm-none-eabi/lib/libc.a(lib_a-closer.o): In function `_close_r':
closer.c:(.text+0x18): undefined reference to `_close'
/home/robu/CodeSourcery/Sourcery_CodeBench_Lite_for_ARM_EABI/bin/../lib/gcc/arm-none-eabi/4.7.2/../../../../arm-none-eabi/lib/libc.a(lib_a-isattyr.o): In function `_isatty_r':
isattyr.c:(.text+0x18): undefined reference to `_isatty'
/home/robu/CodeSourcery/Sourcery_CodeBench_Lite_for_ARM_EABI/bin/../lib/gcc/arm-none-eabi/4.7.2/../../../../arm-none-eabi/lib/libc.a(lib_a-lseekr.o): In function `_lseek_r':
lseekr.c:(.text+0x20): undefined reference to `_lseek'
/home/robu/CodeSourcery/Sourcery_CodeBench_Lite_for_ARM_EABI/bin/../lib/gcc/arm-none-eabi/4.7.2/../../../../arm-none-eabi/lib/libc.a(lib_a-readr.o): In function `_read_r':
readr.c:(.text+0x20): undefined reference to `_read'
collect2: error: ld returned 1 exit status
make: *** [build/ch.elf] Error 1
It seems that the .ARM.extab section doesn't have a place in memory. In this example they put it in flash:
http://hertaville.com/2012/06/29/a-sample-linker-script/
But you should be ok just carving out some ram for it as follows:
MEMORY
{
ram : org = 0x8000, len = 0x06000000 - 0x10021
extabram : org = 0x06008000 - 0x10020, len = 0x10000 - 0x20
}
Then put the ARM.extab section in the new ram region:
.ARM.extab : {*(.ARM.extab* .gnu.linkonce.armextab.*)} > extabram
You could also just try to put it into the normal ram section...
.ARM.extab : {*(.ARM.extab* .gnu.linkonce.armextab.*)} > ram
The .ARM.extab section is created when exception support is required.
One of the differences between malloc and new is that new cannot return NULL: it must throw an exception if memory allocation fails.
Would compiling with the --force_new_nothrow option help?
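For comparison, standard C++ also has a non-throwing form of new that reports failure by returning a null pointer, much like malloc. The sketch below only illustrates that difference in behaviour; Sample, make_sample and read_value are hypothetical names, and whether using this form removes the need for .ARM.extab depends on the toolchain and its runtime library, so it is not claimed here.

```cpp
#include <cassert>
#include <new>

// Hypothetical payload type used only for this illustration.
struct Sample {
    int value;
};

// Allocate with the non-throwing form of new: on failure this returns
// nullptr instead of throwing std::bad_alloc.
Sample* make_sample(int v) {
    return new (std::nothrow) Sample{v};
}

// Callers have to check for nullptr themselves before using the object.
int read_value(const Sample* p) {
    assert(p != nullptr);
    return p->value;
}
```

An object created this way is still released with a plain delete.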