c/c++: How can I know the size of used flash memory?

I recently faced a flash overflow problem. After doing some optimization in the code, I saved some flash memory and executed the software successfully. I want to know how much flash memory was saved through my changes. Please let me know how I can check the used flash / available flash memory. Also, I want to know how much flash is utilized by a particular function/file.
Below is some information about my development environment.
- AVR microcontroller with 64 KB RAM and 512 KB flash.
- Using FreeRTOS.
- Using the GNU C++ compiler.
- Using AVRATJTAGEICE for programming and debugging.
Please let me know the solution.
Regards,
Jagadeep.

GCC's size program is what you're looking for.
size can be passed the full compiled .elf file. It will, by default, output something like this:
$ size linked-file.elf
text data bss dec hex filename
11228 112 1488 12828 321c linked-file.elf
This is saying:
There are 11228 bytes in the .text "section" of this file. This is generally for functions.
There are 112 bytes of initialized data: global variables in the program with initial values.
There are 1488 bytes of uninitialized data: global variables without initial values.
dec is simply the sum of the previous 3 values: 11228 + 112 + 1488 = 12828.
hex is simply the hexadecimal representation of the dec value: 0x321c == 12828.
For embedded systems, generally dec needs to be smaller than the flash size of your target device (or the available space on the device).
It is generally sufficient to simply watch the dec or text outputs of GCC's size command to monitor the size of your compiled code over time. A large jump in size often indicates a poorly implemented new feature or constexpr values that are not getting compiled away. (Don't forget -ffunction-sections and -fdata-sections.)
Note: For AVRs, you'll want to use avr-size for checking the linked size of AVR .elf files. avr-size takes an extra argument naming the target chip and will automatically calculate the percentage of used flash for your chosen chip.
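For example (a sketch only: the AVR-specific summary format requires an avr-size built with the AVR patch, which most Atmel/Microchip toolchains ship, and atmega2560 is just an illustrative chip name):
$ avr-size -C --mcu=atmega2560 linked-file.elf
This prints an "AVR Memory Usage" summary with program (flash) and data (RAM) usage as byte counts and as percentages of the chosen chip's capacity, which is a convenient single number to watch between builds.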
GCC's size also works directly on intermediate object files.
This is particularly useful if you want to check the compiled size of functions.
You should see something like this excerpt:
$ size -A main.cpp.o
main.cpp.o :
section size addr
.group 8 0
.group 8 0
.text 0 0
.data 0 0
.bss 0 0
.text._Z8sendByteh 8 0
.text._ZN3XMC5IOpin7setModeENS0_4ModeE 64 0
.text._ZN7NamSpac6OptionIN5Clock4TimeEEmmEi 76 0
.text.Default_Handler 24 0
.text.HardFault_Handler 16 0
.text.SVC_Handler 16 0
.text.PendSV_Handler 16 0
.text.SysTick_Handler 28 0
.text._Z5errorPKc 8 0
.text._ZN7NamSpac5Motor2goEi 368 0
.text._ZN7NamSpac5Motor3getEv 12 0
.rodata.cst1 1 0
.text.startup.main 632 0
.text._ZN7NamSpac7Program3runEv 380 0
.text._ZN7NamSpac8Position4tickEv 24 0
.text.startup._GLOBAL__sub_I__ZN7NamSpac7displayE 292 0
.init_array 4 0
.bss._ZN5Debug9formatterE 4 0
.rodata._ZL10dispDigits 8 0
.bss.position 4 0
.bss.motorState 4 0
.bss.count 4 0
.rodata._ZL9diameters 20 0
.bss._ZN7NamSpac8diameterE 16 0
.bss._ZN5Debug3pinE 12 0
.bss._ZN7NamSpac7displayE 24 0
.rodata.str1.4 153 0
.rodata._ZL12dispSegments 32 0
.bss._ZL16diametersDisplay 10 0
.bss.loadAggregate 4 0
.bss.startCount 4 0
.bss._ZL15runtimesDisplay 10 0
.bss._ZN7NamSpac7runtimeE 16 0
.bss.startTime 4 0
.rodata._ZL8runtimes 20 0
.comment 111 0
.ARM.attributes 49 0
Total 2494

Please let me know the solution.
Sorry, there's no single solution! You've got to go through what's linked into your final ELF and decide whether it was linked by intent or as an unwanted default.
Please let me know how I can check the used flash / available flash memory.
That primarily depends on your actual target hardware platform, so you have to manage to get your .text section to fit in there.
Also, I want to know how much flash is utilized by a particular function/file.
The nm tool from GNU binutils provides detailed information about any (global) symbol found in an ELF file and the space it occupies in its associated section. You'll just need to grep the results for particular functions/classes/namespaces (ideally demangled!) to accumulate section-type and symbol-filtered outputs for analysis.
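For example (a sketch; avr-nm accepts the same flags as GNU nm, and demangling makes the C++ symbols readable):
$ avr-nm --print-size --size-sort --demangle linked-file.elf
The second column is each symbol's size within its section, so filtering for the 't'/'T' (code) symbols or for a particular namespace lets you attribute flash usage to individual functions or files.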
That's the approach I've been using for a little tool called nmalyzr. Sorry to say, as it stands in the Git repo, it's not really working as intended (I've got working versions that aren't pushed back).
In general, it's a good strategy to chase code that has #include <iostream> statements (no matter whether std::cout or the like is actually used, static instances are provided!), or unwanted newlib/libstdc++ bindings such as default exception handling.

Use the size command from binutils on the generated ELF file. As you seem to use an AVR chip, use avr-size.
To get the size of functions, use the nm command from binutils (avr-nm on AVR chips).

Related

Proving that an executable running many times using shared objects is working correctly and using less memory than the statically linked version

I have a binary file for a program written in C/C++ that will be run many times on the same server (Linux CentOS 7, x64) in order to simulate a load test on a remote server. The executable can be either statically linked (.a) or dynamically linked (.so) to the dependent libraries.
In order to run as many copies as possible we are trying to run the version that uses dynamic linking in order to share as much memory between the processes as possible, however it appears to be making no difference (if anything it is slightly worse).
For example, running 9000 copies of the statically linked process uses 43 GB of RAM; using the dynamically linked version uses 48 GB of RAM.
Using 'pmap -XX' it would appear the shared objects are being used correctly (e.g. for one of them, libPocoFoundation.so):
Address Perm Offset Device Inode Size Rss Pss Shared_Clean Shared_Dirty Private_Clean Private_Dirty Referenced Anonymous AnonHugePages Swap KernelPageSize MMUPageSize Locked VmFlagsMapping
7f6a9498c000 r-xp 00000000 fd:00 26053805 1940 844 422 844 0 0 0 844 0 0 0 4 4 0 rd ex mr mw me sd libPocoFoundation.so
7f6a94b71000 ---p 001e5000 fd:00 26053805 2044 0 0 0 0 0 0 0 0 0 0 4 4 0 mr mw me sd libPocoFoundation.so
7f6a94d70000 r--p 001e4000 fd:00 26053805 60 60 60 0 0 0 60 32 60 0 0 4 4 0 rd mr mw me ac sd libPocoFoundation.so
7f6a94d7f000 rw-p 001f3000 fd:00 26053805 16 16 16 0 0 0 16 16 16 0 0 4 4 0 rd wr mr mw me ac sd libPocoFoundation.so
There is a significant Shared_Clean value for libPocoFoundation.so, so I think this is working OK.
All the tools I can find to check free memory (meminfo.py, top, free, smem, htop) seem to roughly agree on the amount of memory used and free, so I'm inclined to believe the figures.
All libraries (and the executable) are compiled with -fPIC using gcc/g++ 4.9.2 (needs to be an older version as this product is compiled on very old platforms).
Can anyone possibly explain why this isn't working (or at least doesn't appear to be working) or what I am doing wrong?
Many thanks.

How to get stat information from a child process to measure resource utilization?

I feel like this must have a simple answer, but I really don't know how to approach this.
For background, the stack of things is like this:
Python script -> C++ binary -(fork)-> actual thing we want to measure.
Essentially, we have a python script that simulates an environment by using tmp directories and running multiple instances of this network software stack we're developing. The script calls a host binary (which is unimportant here), and then, after it loads, a helper binary. The helper binary can be passed a parameter to daemonize, and when it does this, it forks in the usual way.
What we need to do is measure the daemon's CPU utilization, but I don't really know how to. What I have done is read the stat file periodically, but since the process daemonizes, I can't use echo $! to get its PID. Using ps aux | grep 'thing' works fine, but I think this is giving me the parent process, because the stat information looks like this:
1472582561 9455 (nlsr) S 1 9455 9455 0 -1 4218944 394 0 0 0 13 0 0 0 20 0 2 0 909820 184770560 3851 18446744073709551615 4194304 5318592 140734694817376 140734694810512 140084250723843 0 0 16781312 0 0 0 0 17 0 0 0 0 0 0 7416544 7421528 16224256 140734694825496 140734694825524 140734694825524 140734694825962 0
I know that the parent process should not be PID 1, and the utime field and similar fields should definitely be greater than 13 clock ticks. This is what leads me to conclude that this process is really the parent process, and not the forked child that's doing all the work.
I can modify pretty much any file necessary, but because of code review constraints, design specs., etc., the less I have to change along many files, the better.
- Get the PID of the child reliably: fork() returns the PID of the child to the parent.
- Get the CPU stats from /proc/[PID]/stat; field #14, utime, is the CPU time spent in user code, measured in clock ticks (see the sketch below).
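A minimal sketch of that approach, assuming Linux and that the measuring process is the one that forks the worker (the helper name child_utime_ticks is made up for illustration):

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns the child's utime in clock ticks, or -1 on error.
// Field 2 (comm) may contain spaces, so parsing starts after the last ')'.
long child_utime_ticks(pid_t pid) {
    std::ifstream stat("/proc/" + std::to_string(pid) + "/stat");
    std::string line;
    if (!std::getline(stat, line)) return -1;
    std::istringstream rest(line.substr(line.rfind(')') + 2)); // skip "pid (comm) "
    std::string field;
    // After comm, utime is the 12th space-separated field (field #14 overall).
    for (int i = 0; i < 11; ++i) rest >> field;
    long utime;
    return (rest >> utime) ? utime : -1;
}

int main() {
    pid_t pid = fork();
    if (pid == 0) {                       // child: burn some CPU
        volatile unsigned long x = 0;
        for (unsigned long i = 0; i < 200000000UL; ++i) x += i;
        _exit(0);
    }
    sleep(1);                             // let the child run for a while
    std::cout << "child utime: " << child_utime_ticks(pid)
              << " ticks (" << sysconf(_SC_CLK_TCK) << " ticks/sec)\n";
    waitpid(pid, nullptr, 0);
}

If the helper double-forks when it daemonizes, the direct child's PID is not the daemon's; in that case you would need the daemon to write a PID file or report its PID some other way before sampling its stat file.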

Accessing particular section in ELF

We have a program in which a section is added named .proghead.
I can read the .proghead section data using the following command:
$ readelf -x .proghead elf-binary-file
Hex dump of section '.proghead':
0x0058b960 00112233 00000000 00010000 00000000 .."3............
0x0058b970 15200704 00000000 00016904 00000000 . ........i.....
Now I have to access this section from a C/C++ program.
Can someone please help me write C/C++ code to read a particular section of an ELF binary?
Any help is highly appreciated.
What you need is to read the section headers (Elf64_Shdr) to find the section names and their offsets. The relevant information lies in the sh_name and sh_offset fields. sh_name is an index into the section header string table, so you look the name up there and compare it with the section you need. On finding the required section, you can get its offset (sh_offset) and its size (sh_size). Now it is easy to get the data with a loop that reads from sh_offset to sh_offset + sh_size. This is theoretically correct, and hopefully you will get the data of the required section. For further help check the following links: "Get elf sections offsets" and "How to get a pointer to an specific section of a program from within itself? (Maybe with libelf)".
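A minimal sketch of that approach for a 64-bit ELF, using the structures from <elf.h> (error handling and ELF-class/endianness checks are omitted; use the Elf32_* types for 32-bit binaries):

#include <elf.h>
#include <cstdio>
#include <cstring>
#include <fstream>
#include <iterator>
#include <vector>

int main(int argc, char** argv) {
    if (argc < 3) { std::fprintf(stderr, "usage: %s <elf-file> <section>\n", argv[0]); return 1; }
    std::ifstream f(argv[1], std::ios::binary);
    std::vector<char> image((std::istreambuf_iterator<char>(f)),
                             std::istreambuf_iterator<char>());

    auto* ehdr  = reinterpret_cast<Elf64_Ehdr*>(image.data());
    auto* shdrs = reinterpret_cast<Elf64_Shdr*>(image.data() + ehdr->e_shoff);
    // Section names live in the section header string table; sh_name is an index into it.
    const char* shstrtab = image.data() + shdrs[ehdr->e_shstrndx].sh_offset;

    for (int i = 0; i < ehdr->e_shnum; ++i) {
        if (std::strcmp(shstrtab + shdrs[i].sh_name, argv[2]) == 0) {
            const unsigned char* data =
                reinterpret_cast<unsigned char*>(image.data()) + shdrs[i].sh_offset;
            for (Elf64_Xword j = 0; j < shdrs[i].sh_size; ++j)      // hex dump of the section
                std::printf("%02x%c", data[j], (j % 16 == 15) ? '\n' : ' ');
            std::printf("\n");
            return 0;
        }
    }
    std::fprintf(stderr, "section %s not found\n", argv[2]);
    return 1;
}

Compiled with, say, g++ -o readsec readsec.cpp, running ./readsec elf-binary-file .proghead would print the same bytes that readelf -x shows.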
You can copy one section of the binary to a text file using the command objcopy from package binutils:
$ objcopy -O binary --only-section=<section> <binary> <output>
So in your case:
$ objcopy -O binary --only-section=.proghead elf-binary-file output.proghead
After that, you can simply code a C++ program that reads a binary file. This approach would work as long as all you need to do is to read that section and not to modify the binary.
If you need to modify the binary, you would need to start reading the section at that section's offset for size bytes. It's possible to use readelf to find out at what offset a section starts and its size:
$ readelf --wide -S /bin/ls
There are 28 section headers, starting at offset 0x1c760:
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .interp PROGBITS 0000000000400238 000238 00001c 00 A 0 0 1
[ 2] .note.ABI-tag NOTE 0000000000400254 000254 000020 00 A 0 0 4
[ 3] .note.gnu.build-id NOTE 0000000000400274 000274 000024 00 A 0 0 4
[ 4] .gnu.hash GNU_HASH 0000000000400298 000298 000068 00 A 5 0 8
[ 5] .dynsym DYNSYM 0000000000400300 000300 000c18 18 A 6 1 8
[ 6] .dynstr STRTAB 0000000000400f18 000f18 000593 00 A 0 0 1
However, bear in mind that directly modifying a binary is fine as long as no new data is added or removed. Adding new data will grow a section, which results in overwriting data of other sections and disorganizing the section index. Shrinking a section and filling it up with padding may be OK, but doing it in the .text section, for instance, may affect the program's logic if there's a relative jump to a location that no longer exists.
In general, modify the linker command file to give a name to the first address of the .proghead section.
Then, in the C file, write a struct that covers the contents of the .proghead section.
Then set a C pointer variable, of the above struct type, to point to the .proghead section.
From then on, pointer->fieldName will access each of the fields in the struct that overlays the .proghead section.
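A minimal sketch of that idea (the symbol name __proghead_start and the field layout are made up for illustration; the linker script must define the symbol at the start of .proghead and the struct must match the section's real layout):

#include <cstdint>

struct ProgHead {                 // must mirror the actual layout of .proghead
    uint32_t magic;
    uint32_t reserved;
    uint64_t build_info;
    uint64_t entry_info;
};

extern "C" const ProgHead __proghead_start;   // provided by the linker script

uint32_t proghead_magic() {
    const ProgHead* ph = &__proghead_start;
    return ph->magic;                          // ph->fieldName reads the section in place
}

In the linker script, a line such as "__proghead_start = .;" placed at the start of the .proghead output section (an assumption about your particular script) is what makes the extern declaration resolve.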

Core dump note section

Following my question about manually generating a core dump file, I decided to dive into it and get my hands dirty.
I am able to build the basic core dump structure and get my dead program's memory back into the core dump within a big LOAD section. When debugging in GDB, my variables are back, no problem with that.
Here comes the tricky part: how do I get GDB to retrieve information about where the program was when it crashed?
I know that the note section of the core dump contains this information (CPU registers among others). Here is what objdump -h gives for a "real" core dump:
core.28339: file format elf32-i386
Sections:
Idx Name Size VMA LMA File off Algn
0 note0 000001e8 00000000 00000000 000000f4 2**0
CONTENTS, READONLY
1 .reg/28339 00000044 00000000 00000000 00000150 2**2
CONTENTS
2 .reg 00000044 00000000 00000000 00000150 2**2
CONTENTS
3 .auxv 000000a0 00000000 00000000 0000023c 2**2
CONTENTS
4 load1a 00001000 08010000 00000000 00001000 2**12
CONTENTS, ALLOC, LOAD, READONLY, CODE
.. other load sections ...
I figured out, thanks to readelf, that those .reg sections contain data mapped from some structures:
Notes at offset 0x000000f4 with length 0x000001e8:
Owner Data size Description
CORE 0x00000090 NT_PRSTATUS (prstatus structure)
CORE 0x0000007c NT_PRPSINFO (prpsinfo structure)
CORE 0x000000a0 NT_AUXV (auxiliary vector)
Can someone give me directions on how the notes section is structured?
I tried writing those structures directly to my file; it did not work, and I am obviously missing something here.
I looked at the Google Coredumper code and took some bits of it, but writing the note section is not that simple, and any detailed information about what exactly it contains and its format is welcome.
Edit #1: following the first comment
I figured out my ELF file should be structured as follows:
- ELF header: ElfW(Ehdr)
- Program headers (Ehdr.e_phnum times ElfW(Phdr)); here I basically used one PT_NOTE and one PT_LOAD header
- Note records, each made of:
  - the record's header (ElfW(Nhdr))
  - the record's name (.n_namesz long)
  - the record's data (.n_descsz long)
- Program segment containing all my program's memory
Then I will have to put 3 note records: one for the prstatus, one for the prpsinfo and one for the auxiliary vector.
This seems to be the right way, as readelf gives me output similar to what I got above with the real core dump.
Edit #2: after getting the correct structure
I am now struggling with the different structures composing the note records.
Here is what I get when running eu-readelf --notes on my core dump:
Note segment of 540 bytes at offset 0x74:
Owner Data size Type
CORE 336 PRSTATUS
CORE 136 PRPSINFO
CORE 8 AUXV
NULL
Here is what I get when running the same command on the real core dump :
Note segment of 488 bytes at offset 0xf4:
Owner Data size Type
CORE 144 PRSTATUS
info.si_signo: 11, info.si_code: 0, info.si_errno: 0, cursig: 11
sigpend: <>
sighold: <>
pid: 28339, ppid: 41446, pgrp: 28339, sid: 41446
utime: 0.000000, stime: 0.000000, cutime: 0.000000, cstime: 0.000000
orig_eax: -1, fpvalid: 0
ebx: -1 ecx: 0 edx: 0
esi: 0 edi: 0 ebp: 0xffb9fcbc
eax: -1 eip: 0x08014b26 eflags: 0x00010286
esp: 0xffb9fcb4
ds: 0x002b es: 0x002b fs: 0x0000 gs: 0x0000 cs: 0x0023 ss: 0x002b
CORE 124 PRPSINFO
state: 0, sname: R, zomb: 0, nice: 0, flag: 0x00400400
uid: 9432, gid: 6246, pid: 28339, ppid: 41446, pgrp: 28339, sid: 41446
fname: pikeos_app, psargs: ./pikeos_app
CORE 160 AUXV
SYSINFO: 0xf7768420
SYSINFO_EHDR: 0xf7768000
HWCAP: 0xbfebfbff <fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe>
PAGESZ: 4096
CLKTCK: 100
PHDR: 0x8010034
PHENT: 32
PHNUM: 2
BASE: 0
FLAGS: 0
ENTRY: 0x80100be
UID: 9432
EUID: 9432
GID: 6246
EGID: 6246
SECURE: 0
RANDOM: 0xffb9ffab
EXECFN: 0xffba1feb
PLATFORM: 0xffb9ffbb
NULL
Does someone have any clue or explanation as to why my note records are not read properly?
I thought it might be due to incorrect offsets, but then why would the records be listed correctly?
Thanks!
I was having the same troubles some time ago with my project of converting CRIU images into core dumps. It is fully written in Python (even the ELF structures are in ctypes), so it could be used as a guide. See https://github.com/efiop/criu-coredump . For example, how everything is structured can be seen here: https://github.com/efiop/criu-coredump/blob/master/criu_coredump/core_dump.py .
Can someone give me directions on how is structured the Notes section?
The notes section is a concatenation of variable-sized note records. Each note record begins with an ElfW(Nhdr) structure, followed by a (variable-sized) name (of length .n_namesz, padded so the total size of the name on disk is divisible by 4) and data (of length .n_descsz, similarly padded).
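A sketch of what that means in code, assuming a Linux/glibc host (elf_prstatus comes from <sys/procfs.h>) and a 32-bit target like the core dump above; the function names are made up for illustration:

#include <elf.h>
#include <sys/procfs.h>   // struct elf_prstatus (glibc)
#include <cstdio>

// Write data followed by zero padding up to the next 4-byte boundary.
static void write_padded(std::FILE* f, const void* data, std::size_t len) {
    static const char zeros[4] = {0, 0, 0, 0};
    std::fwrite(data, 1, len, f);
    if (len % 4) std::fwrite(zeros, 1, 4 - len % 4, f);
}

// Append one "CORE"/NT_PRSTATUS note record at the current file position.
void write_prstatus_note(std::FILE* f, const elf_prstatus& prs) {
    const char name[] = "CORE";
    Elf32_Nhdr nhdr;
    nhdr.n_namesz = sizeof(name);     // includes the terminating NUL
    nhdr.n_descsz = sizeof(prs);
    nhdr.n_type   = NT_PRSTATUS;
    std::fwrite(&nhdr, sizeof(nhdr), 1, f);
    write_padded(f, name, sizeof(name));
    write_padded(f, &prs, sizeof(prs));
}

The NT_PRPSINFO and NT_AUXV records follow the same pattern, with elf_prpsinfo and the raw auxiliary vector bytes as the descriptor data.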
After some tests I figured things out; answering for anyone looking for this information:
Can someone confirm I am going the right way structuring my Elf file this way ?
Yes.
As GDB accepts the file, this seems to be the right way of doing it. The results shown by readelf -a show the correct structure; good so far.
I am not sure where the data (note & program segments) should lie in my file: is there a mandatory order, or is it my program headers' offsets that define where the data is?
The offsets given in Phdr.p_offset should point to where the data lies in the ELF file. They are counted from the very beginning of the file.
For example :
The p_offset for the PT_NOTE program header should be set to sizeof(ElfW(Ehdr)) + ehdr.e_phnum*sizeof(ElfW(Phdr)), ehdr.e_phnum being the number of program headers present in the ELF file.
For the PT_LOAD program header, this is a bit longer, because we also have to add the length of all the note records. For a "standard" core dump with a note segment containing NT_PRSTATUS, NT_PRPSINFO and NT_AUXV records, the offset for the PT_LOAD data (Phdr.p_offset) will be:
sizeof(ElfW(Ehdr)) + ehdr.e_phnum*sizeof(ElfW(Phdr))
+ sizeof(ElfW(Nhdr)) + sizeof(name_of_section) + sizeof(struct prstatus)
+ sizeof(ElfW(Nhdr)) + sizeof(name_of_section) + sizeof(struct prpsinfo)
+ sizeof(ElfW(Nhdr)) + sizeof(name_of_section) + sizeof(struct auxv_t)
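The same computation sketched in code, with the name and descriptor sizes rounded up to 4-byte boundaries as the note format requires (the struct sizes assume glibc's <sys/procfs.h>, and the auxv descriptor size varies per process; 160 is the value shown in the real dump above):

#include <elf.h>
#include <sys/procfs.h>
#include <cstddef>
#include <cstdio>

constexpr std::size_t pad4(std::size_t n) { return (n + 3) & ~std::size_t(3); }

// On-disk size of one note record: header + padded name + padded descriptor.
constexpr std::size_t note_record(std::size_t name_len, std::size_t desc_len) {
    return sizeof(Elf32_Nhdr) + pad4(name_len) + pad4(desc_len);
}

int main() {
    // e_phnum == 2: one PT_NOTE header and one PT_LOAD header.
    const std::size_t load_offset =
          sizeof(Elf32_Ehdr) + 2 * sizeof(Elf32_Phdr)
        + note_record(sizeof("CORE"), sizeof(elf_prstatus))   // NT_PRSTATUS
        + note_record(sizeof("CORE"), sizeof(elf_prpsinfo))   // NT_PRPSINFO
        + note_record(sizeof("CORE"), 160);                   // NT_AUXV, size taken from the dump above
    std::printf("PT_LOAD p_offset = %zu\n", load_offset);
}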

Duplicate GLX FBConfigs from glXChooseFBConfig

When running glxinfo, or using my own code (calling glXChooseFBConfig to get a list of GLX framebuffer configurations), I see that there are entries which are identical except for their ID code.
For example:
$ glxinfo
...
0x77 0 tc 0 32 0 r y . 8 8 8 8 4 24 8 16 16 16 16 0 0 None
...
0xae 0 tc 0 32 0 r y . 8 8 8 8 4 24 8 16 16 16 16 0 0 None
...
What is the reason for this duplication? Is there an underlying difference between these seemingly identical modes?
Though they have the same buffer configuration from the point of view of OpenGL, they differ from the point of view of X11. Specifically, they have different X visual bit depths (one is a 24-bit X visual and one is a 32-bit X visual).
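A short way to see that difference programmatically (a sketch; it needs a running X display, links against -lX11 and -lGL, and not every FBConfig has an associated X visual):

#include <GL/glx.h>
#include <X11/Xlib.h>
#include <X11/Xutil.h>
#include <cstdio>

int main() {
    Display* dpy = XOpenDisplay(nullptr);
    int n = 0;
    GLXFBConfig* cfgs = glXGetFBConfigs(dpy, DefaultScreen(dpy), &n);
    for (int i = 0; i < n; ++i) {
        int id = 0;
        glXGetFBConfigAttrib(dpy, cfgs[i], GLX_FBCONFIG_ID, &id);
        // Only configs that map to an X visual are interesting here.
        if (XVisualInfo* vi = glXGetVisualFromFBConfig(dpy, cfgs[i])) {
            std::printf("fbconfig 0x%x -> visual 0x%lx, depth %d\n",
                        id, vi->visualid, vi->depth);
            XFree(vi);
        }
    }
    XFree(cfgs);
    XCloseDisplay(dpy);
}

Two entries that glxinfo prints identically will show up here with different visual depths.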
[note: I found this out while I was composing the question, but since I hadn't found an answer in my web searches I'm posting the question and answer here anyway -- maybe someone else will find it useful in the future.]