simavr ignores memory content of located sections - gdb

Question also posted in https://github.com/buserror/simavr/issues/484
For a project with an ATmega324P, I use special sections to ensure that some code is located at specific addresses. This works fine on the real target, but these sections are not loaded properly in simavr.
Any idea what is going wrong and how to solve this problem?
To reproduce this problem I use an Ubuntu system with avr-gcc (5.4.0), avr-gdb (10.1.90.20210103-git) and simavr (1.6+dfsg-3).
Create file main.c
#define ATT_SECTION_APP __attribute__((section(".app")))
int main () ATT_SECTION_APP;
int main () {
asm("nop");
return 0;
}
Build project and locate section .app to 0x7100
avr-gcc -c -o main.o main.c
avr-gcc -o main.elf -Wl,-section-start=.app=0x7100 -mmcu=atmega324p main.o
Check result in elf-file
Disassemble elf-file with avr-objdump -d main.elf
Disassembly of section .app:
00007100 <main>:
7100: cf 93 push r28
7102: df 93 push r29
7104: cd b7 in r28, 0x3d ; 61
7106: de b7 in r29, 0x3e ; 62
7108: 00 00 nop
710a: 80 e0 ldi r24, 0x00 ; 0
710c: 90 e0 ldi r25, 0x00 ; 0
710e: df 91 pop r29
7110: cf 91 pop r28
7112: 08 95 ret
Start simavr
simavr -g -m atmega324p main.elf
Start avr-gdb and show memory on location 0x7100
In another shell, start avr-gdb with avr-gdb main.elf.
Then execute the following commands on gdb console:
target remote localhost:1234
x/20b 0x7100
As you can see, the memory locations starting at 0x7100 read back as erased (0xff) instead of showing the same content as the disassembly. Symbols such as main are resolved as expected.
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
0x00000000 in __vectors ()
(gdb) x/20b 0x7100
0x7100 <main>: 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff
0x7108 <main+8>: 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff
0x7110 <main+16>: 0xff 0xff 0xff 0xff


Valgrind reports SIGILL in std::string::swap

I am using valgrind 3.16 to debug my program, and it reports illegal instruction in std::string::swap. The program is compiled on Ubuntu 18.04 with g++ 7.5.0.
vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFE 0x8 0x6F 0x47 0x1 0xC5 0xF8 0x11
vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
==392550== valgrind: Unrecognised instruction at address 0x3fef89.
==392550== at 0x3FEF89: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::swap(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) (in /tmp/tmp.BnUAMaceSS/cmake-build-release-parallel-chameleon/release/lqf-tpch-query-dev)
==392550== by 0x4F4CD0C: std::__cxx11::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::overflow(int) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
I have two questions:
Is there a website where I can look up an instruction from the given opcode bytes? I tried this website but cannot find anything corresponding to 0x62 0xF1 0xFE ...
Why would valgrind report SIGILL in the standard library?
The code works well by itself. I also applied the undefined, thread, leak and address sanitizers, and stack-check. They report no error. So I think the problem is from valgrind.
I really appreciate @TedLyngmo's comments about applying sanitizers. Although the sanitizers themselves do not report any error, the error location changes after applying them. It now points to part of my code base that uses AVX-512 instructions. When I replace those instructions, Valgrind can run.
Although I am still not sure why applying a sanitizer has this effect, I would like to share this experience with those who may encounter similar errors in the future.
Question 1: convert opcode to assembly
This website can disassemble opcode bytes to assembly online for x86 or x64. For the opcode in your question, enter 62 F1 FE 08 6F 47 01 C5 F8 11 (note: write 08 for 0x8) into the text box under "disassemble". It then outputs:
Disassembly:
0: 62 f1 fe 08 6f 47 01 vmovdqu64 xmm0,XMMWORD PTR [rdi+0x10]
7: c5 .byte 0xc5
8: f8 clc
9: 11 .byte 0x11
So the instruction is vmovdqu64, which is an AVX-512 instruction.
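If you would rather prepare the byte string locally before pasting it into a disassembler, the zero-padding can be scripted. The following is only a minimal shell sketch and not part of the original answer: the function name normalize_opcode_bytes is made up for illustration, and it assumes every token carries a 0x prefix, as in the valgrind message.

```shell
# Turn tokens such as "0x8" or "0xF1" into two-digit lowercase hex.
# The subshell body "( ... )" keeps the helper variables local.
normalize_opcode_bytes() (
    out=""
    for tok in "$@"; do
        # Shell arithmetic understands the 0x prefix; printf pads to two digits.
        out="$out$(printf '%02x ' "$(($tok))")"
    done
    # Drop the trailing space left by the last token.
    printf '%s\n' "${out% }"
)

# Example, using the bytes from the valgrind message:
#   normalize_opcode_bytes 0x62 0xF1 0xFE 0x8 0x6F 0x47 0x1 0xC5 0xF8 0x11
```

Tokens without the 0x prefix (for example a bare 08) are not handled by this sketch, because the shell would try to read them as octal.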
Alternatively, you can hack it locally as follows:
$ cat instruct.c
#include <stdio.h>
int main(void)
{
asm(".byte 0x62, 0xF1, 0xFE, 0x8, 0x6F, 0x47, 0x1, 0xC5, 0xF8, 0x11");
return 0;
}
$ gcc -c instruct.c
$ objdump -drwC -Mintel instruct.o
instruct.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <main>:
0: f3 0f 1e fa endbr64
4: 55 push rbp
5: 48 89 e5 mov rbp,rsp
8: 62 f1 fe 08 6f 47 01 vmovdqu64 xmm0,XMMWORD PTR [rdi+0x10]
f: c5 f8 11 b8 00 00 00 00 vmovups XMMWORD PTR [rax+0x0],xmm7
17: 5d pop rbp
18: c3 ret
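A variation that avoids compiling a C file is to write the bytes into a raw file and let objdump disassemble it as a flat binary. This is only a sketch: the helper names write_opcode_bytes and disassemble_raw and the file name insn.bin are hypothetical, while -D, -b binary, -m i386:x86-64 and -M intel are standard GNU objdump options.

```shell
# Write the given byte values (e.g. 0x62 0x8) into a raw binary file.
# $1 = output file, remaining arguments = byte values with a 0x prefix.
write_opcode_bytes() (
    out_file=$1
    shift
    : > "$out_file"
    for tok in "$@"; do
        # Emit each byte through an octal escape, which POSIX printf supports.
        printf "\\$(printf '%03o' "$(($tok))")" >> "$out_file"
    done
)

# Disassemble a headerless file as 64-bit x86 code in Intel syntax.
disassemble_raw() {
    objdump -D -b binary -m i386:x86-64 -M intel "$1"
}

# Example:
#   write_opcode_bytes insn.bin 0x62 0xF1 0xFE 0x8 0x6F 0x47 0x1 0xC5 0xF8 0x11
#   disassemble_raw insn.bin
```

As with the C variant, the trailing bytes c5 f8 11 are only the beginning of the next instruction, so whatever objdump prints after the vmovdqu64 line should not be trusted.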
For valgrind to run a program that contains AVX-512 instructions, one can compile the program with the gcc flag -mno-avx512f. As documented here, instruction-set extensions can be enabled with a flag (e.g. -mavx512f) and disabled with the corresponding -mno- option.
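One way to confirm that the flag has the intended effect is to look at the compiler's predefined macros, since GCC defines __AVX512F__ only while AVX-512 Foundation code generation is enabled. The following is a small sketch, assuming a GCC-compatible compiler; the helper name avx512f_status is hypothetical.

```shell
# Read a macro dump (as produced by "gcc -dM -E") on stdin and report
# whether the AVX-512 Foundation macro is defined.
avx512f_status() {
    if grep -q '^#define __AVX512F__ 1$'; then
        echo "AVX-512F enabled"
    else
        echo "AVX-512F disabled"
    fi
}

# Example: compare the two outputs on the machine in question.
#   gcc -march=native -dM -E - </dev/null | avx512f_status
#   gcc -march=native -mno-avx512f -dM -E - </dev/null | avx512f_status
```

Note that the flag only affects code you compile yourself; prebuilt libraries are unchanged.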
Another way to use valgrind on a program with AVX-512 instructions is to build Valgrind from source and apply the patches from here. The steps are as follows:
Clone valgrind source code
$ git clone https://sourceware.org/git/valgrind.git
$ cd valgrind
Download patches from here
Copy the patches to local files (e.g. patch512-part1.patch inside the valgrind folder). Currently there are four patches, and they look quite messy because the author keeps updating them (which is good) and obsoleting old ones. The patches should also be applied on top of the right commit, since newer commits on the valgrind master branch may break the build. For example, at the time of writing, the patches from comments 58, 73, 60 and 74 compile successfully on top of commit b77dbefe72e4a5c7bcf1576a02c909010bd56991.
$ git reset --hard b77dbefe72e4a5c7bcf1576a02c909010bd56991
Apply patches
$ patch -p1 < patch512-part1.patch
$ patch -p1 < patch512-part2.patch
$ patch -p1 < patch512-part3.patch
$ patch -p1 < patch512-part4.patch
Compile valgrind
$ ./autogen.sh
$ ./configure --prefix=<install-path>
$ make
$ make install

Why does the file command report "dynamically linked" with gcc -static?

gcc -static -g -O2 -static -o init init-init.o
file init
# init: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, with debug_info, not stripped
ldd init
# ldd (0x7fd49e2ed000)
objdump -p init
init: file format elf64-x86-64
Program Header:
LOAD off 0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**21
filesz 0x00000000000076e4 memsz 0x00000000000076e4 flags r-x
LOAD off 0x0000000000007e30 vaddr 0x0000000000207e30 paddr 0x0000000000207e30 align 2**21
filesz 0x00000000000002d8 memsz 0x0000000000001488 flags rw-
DYNAMIC off 0x0000000000007e60 vaddr 0x0000000000207e60 paddr 0x0000000000207e60 align 2**3
filesz 0x0000000000000150 memsz 0x0000000000000150 flags rw-
STACK off 0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**4
filesz 0x0000000000000000 memsz 0x0000000000000000 flags rw-
RELRO off 0x0000000000007e30 vaddr 0x0000000000207e30 paddr 0x0000000000207e30 align 2**0
filesz 0x00000000000001d0 memsz 0x00000000000001d0 flags r--
Dynamic Section:
SYMBOLIC 0x0000000000000000
INIT 0x00000000000002c0
FINI 0x0000000000006473
GNU_HASH 0x0000000000000158
STRTAB 0x00000000000001b0
SYMTAB 0x0000000000000180
STRSZ 0x0000000000000007
SYMENT 0x0000000000000018
DEBUG 0x0000000000000000
PLTGOT 0x0000000000207fb0
RELA 0x00000000000001b8
RELASZ 0x0000000000000108
RELAENT 0x0000000000000018
BIND_NOW 0x0000000000000000
FLAGS_1 0x0000000008000001
RELACOUNT 0x000000000000000b
readelf init -h
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Shared object file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x1158
Start of program headers: 64 (bytes into file)
Start of section headers: 308248 (bytes into file)
Flags: 0x0
Why is the Type DYN (Shared object file)?
I am trying to compile supermin under Alpine, but src/Makefile.am#L159 requires that the file command reports init as statically linked.
ET_DYN is used for position-independent executables (PIE), whether they are statically linked or not. The lack of a program interpreter and DT_NEEDED entries in the dynamic section indicate that the program is indeed statically linked. You can check that using readelf -l (no .interp) and readelf -d (no NEEDED).
Running the program under strace will also verify that no shared objects are loaded at program start.
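The two readelf checks can be combined into a small helper. This is a sketch under the assumption that GNU readelf is available; the function name linkage_kind is made up here, and it only looks for the two markers named above (an INTERP program header and NEEDED dynamic entries) in readelf's text output.

```shell
# Read the combined output of "readelf -l" and "readelf -d" on stdin.
# A program header row starting with INTERP or a dynamic entry tagged
# (NEEDED) means the dynamic loader is involved at startup.
linkage_kind() {
    if grep -Eq '^[[:space:]]*INTERP[[:space:]]|\(NEEDED\)'; then
        echo "dynamically linked"
    else
        echo "statically linked"
    fi
}

# Example:
#   { readelf -lW init; readelf -dW init; } | linkage_kind
```

For the init binary shown in the question, neither marker appears in the dumps, so this check would classify it as statically linked even though the ELF type is DYN.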

Determining size of c++ method

I have a .cpp file with various methods defined:
// test.cpp
// foo() is not inlined
void foo() {
...
}
I compile it:
g++ test.cpp -o test.o -S
And now I want to determine, from examination of test.o, how many bytes of memory the function foo() takes up. How can I do this?
I ask because I have, through careful profiling and experimentation, determined that I am incurring instruction cache misses leading to significant slowdowns on certain critical paths. I'd therefore like to get a sense of how many bytes are occupied by various methods, to guide my efforts in shrinking my code size.
I wouldn't recommend the -S flag for that, unless you're in love with your ISA's manual and hand-calculating instruction sizes. Instead, just build and disassemble, presto - out comes the answer:
$ cc -c example.c
$ objdump -d example.o
example.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <f>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 89 7d fc mov %edi,-0x4(%rbp)
7: 8b 45 fc mov -0x4(%rbp),%eax
a: 83 c0 03 add $0x3,%eax
d: 5d pop %rbp
e: c3 retq
As you can see, this function is 15 bytes long.
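If you only need the number, nm can print symbol sizes directly, which saves reading the disassembly. The sketch below parses nm -S output; symbol_size is a hypothetical helper name, and it assumes the object file carries size information for the symbol (compilers normally emit it for functions in ELF objects).

```shell
# Print the size in bytes of one symbol.
# $1 = symbol name, stdin = output of "nm -S" (address, size, type, name).
symbol_size() (
    while read -r addr size type name; do
        if [ "$name" = "$1" ]; then
            # nm prints the size in hex without a 0x prefix.
            echo "$((0x$size))"
            exit 0
        fi
    done
    # Symbol not found, or listed without a size.
    exit 1
)

# Example:
#   nm -S example.o | symbol_size f
# For C++ objects, demangle first and quote the name:
#   nm -S -C test.o | symbol_size 'foo()'
```

Running nm -S --size-sort on its own lists all sized symbols from smallest to largest, which is often enough to spot the biggest functions.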
Alternatively, build with map file generation and parse the map file. There might be some padding at the end, but it will give you an idea.
With g++, the option is:
-Xlinker -Map=MyProject.txt

Converting assembly instructions to binary using objdump or gcc -c

I'm working on the buffer bomb lab and I'm stuck on one thing. I've written my exploit code to solve level 2 (firecracker) but I'm not sure how I can convert this to its raw form using gcc -c.
I've never written a compilable assembly file, but I have written individual instructions and traced existing ones. So I know how they work, but I'm not sure how to write the code file itself syntactically.
Here are the current contents of the file I'm trying to convert to its raw form:
movl $0x1a4bb386, 0x804d200
push $0x0804915f
ret
What do I need to add to this so that it will compile using gcc -c or objdump -d?
I need to figure out how many bytes these instructions take up and how to insert them into the buffer so that I can write my buffer overflow exploit.
Thanks.
Compiled with gcc -m32 -c code.s
Output was a file code.o
I used objdump -d code.o > code.asm to obtain the raw assembly and bytes needed.
Output:
code.o: file format elf32-i386
Disassembly of section .text:
00000000 <.text>:
0: c7 05 00 d2 04 08 86 movl $0x1a4bb386,0x804d200
7: b3 4b 1a
a: 68 5f 91 04 08 push $0x804915f
f: c3 ret
I'm curious what the instruction does at VA 0x7 ... Is that just the rest of the instruction movl?
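On that last point: the line at offset 0x7 is not a separate instruction. objdump wraps the byte column after seven bytes, so b3 4b 1a is the tail of the movl immediate (0x1a4bb386 stored little-endian as 86 b3 4b 1a). To collect all bytes in order for the exploit string, the byte column can be extracted with a small filter. This is a sketch; objdump_bytes is a hypothetical helper name, and it relies on objdump separating its columns with tab characters.

```shell
# Read "objdump -d" output on stdin and print all instruction bytes,
# in order, as one space-separated line.
objdump_bytes() {
    awk -F'\t' '
        # Instruction rows look like "  addr:<TAB>bytes<TAB>mnemonic";
        # continuation rows have only the address and the bytes.
        /^[[:space:]]*[0-9a-f]+:\t/ {
            gsub(/^ +| +$/, "", $2)
            out = out (out == "" ? "" : " ") $2
        }
        END { print out }
    '
}

# Example:
#   objdump -d code.o | objdump_bytes
```

For the listing above this yields 16 bytes in total. If a raw file is more convenient than a hex string, objcopy -O binary -j .text code.o code.bin writes the same bytes to code.bin.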

NASM and GDB Symbols: "Can't find any code sections in symbol file."

I am trying to get a simple example working from an assembly book I am reading. I am trying to get gdb to work with my simple assembly program that I am assembling with the NASM assembler. Below is the code, and the object file in elf format.
; Version : 1.0
; Created Date : 11/12/2011
; Last Update : 11/12/2011
; Author : Jeff Duntemann
; Description : A simple assembly app for Linux, using NASM 2.05,
; demonstrating the use of Linux INT 80H syscalls
; to display text.
; Build using these commands:
; nasm -f elf -g -F stabs eatsyscall.asm
; ld -o eatsyscall eatsyscall.o
;
SECTION .data ; Section containing initialized data
EatMsg: db "Eat at Joe's!",10
EatLen: equ $-EatMsg
SECTION .bss ; Section containing uninitialized data
SECTION .txt ; Section containing code
global _start ; Linker needs this to find the entry point!
_start:
nop ; This no_op keeps gdb happy (see text)
mov eax,4 ; Specify sys_write syscall
mov ebx,1 ; Specify File Descriptor 1: Standard Output
mov ecx,EatMsg ; Pass offset of the message
mov edx,EatLen ; Pass the length of the message
int 80H ; Make syscall to output the text to stdout
mov eax,1 ; Specify Exit syscall
mov ebx,0 ; Return a code of zero
int 80H ; Make syscall to terminate the program
and
mehoggan@mehoggan:~/Code/AsmWork/eatsyscall$ objdump -s ./eatsyscall.o
./eatsyscall.o: file format elf32-i386
Contents of section .data:
0000 45617420 6174204a 6f652773 210a Eat at Joe's!.
Contents of section .txt:
0000 90b80400 0000bb01 000000b9 00000000 ................
0010 ba0e0000 00cd80b8 01000000 bb000000 ................
0020 00cd80 ...
Contents of section .stab:
0000 00000000 64000100 00000000 ....d.......
Contents of section .stabstr:
0000 00
I am using the following command to assemble:
nasm -f elf -g -F stabs eatsyscall.asm
and I am using the following command to link:
ld -o eatsyscall eatsyscall.o
When I run GDB on the executable I get the following:
mehoggan@mehoggan:~/Code/AsmWork/eatsyscall$ gdb eatsyscall
GNU gdb (Ubuntu/Linaro 7.3-0ubuntu2) 7.3-2011.08
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>...
Reading symbols from /home/mehoggan/Code/AsmWork/eatsyscall/eatsyscall...Can't find any code sections in symbol file
(gdb) quit
What do I need to do in addition to what I am doing above to get gdb to read the debug symbols specified to NASM with the -g flag?
FYI
mehoggan@mehoggan:~/Code/AsmWork/eatsyscall$ cat /etc/*release*
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=11.10
DISTRIB_CODENAME=oneiric
DISTRIB_DESCRIPTION="Ubuntu 11.10"
mehoggan@mehoggan:~/Code/AsmWork/eatsyscall$ uname -a
Linux mehoggan 3.0.0-12-generic-pae #20-Ubuntu SMP Fri Oct 7 16:37:17 UTC 2011 i686 i686 i386 GNU/Linux
mehoggan@mehoggan:~/Code/AsmWork/eatsyscall$
--Update--
Even if I take the 64 bit route with the following commands I am still having no success:
mehoggan@mehoggan:~/Code/AsmWork/eatsyscall$ nasm -f elf64 -g -F stabs eatsyscall.asm
mehoggan@mehoggan:~/Code/AsmWork/eatsyscall$ ld -o eatsyscall eatsyscall.o -melf_x86_64
mehoggan@mehoggan:~/Code/AsmWork/eatsyscall$ ./eatsyscall
bash: ./eatsyscall: cannot execute binary file
mehoggan@mehoggan:~/Code/AsmWork/eatsyscall$ objdump -s eatsyscall.o
eatsyscall.o: file format elf64-x86-64
Contents of section .data:
0000 45617420 6174204a 6f652773 210a Eat at Joe's!.
Contents of section .txt:
0000 9048b804 00000000 00000048 bb010000 .H.........H....
0010 00000000 0048b900 00000000 00000048 .....H.........H
0020 ba0e0000 00000000 00cd8048 b8010000 ...........H....
0030 00000000 0048bb00 00000000 000000cd .....H..........
0040 80 .
Contents of section .stab:
0000 00000000 64000100 00000000 ....d.......
Contents of section .stabstr:
0000 00 .
mehoggan@mehoggan:~/Code/AsmWork/eatsyscall$
How about using section .text instead of section .txt?
That is:
SECTION .data ; Section containing initialized data
EatMsg: db "Eat at Joe's!",10
EatLen: equ $-EatMsg
SECTION .bss ; Section containing uninitialized data
SECTION .text ; instead of .txt
It should work just fine in gdb afterwards. If you are on an x64 architecture, use the x64 flags.
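A quick way to see the difference is to list which sections the object file marks as code, since gdb's message complains about exactly that. The sketch below filters objdump -h output; code_sections is a hypothetical helper name. With the original source, .txt would be expected to be missing from the result, on the assumption that NASM gives code attributes only to section names it knows, such as .text.

```shell
# Read "objdump -h" output on stdin and print the names of all sections
# whose flags include CODE.
code_sections() {
    awk '
        # A section header row starts with a numeric index, then the name.
        $1 ~ /^[0-9]+$/ && NF >= 7 { name = $2; next }
        # The row after it lists the flags; report the section if CODE is set.
        name != "" && /(^|[ ,])CODE([ ,]|$)/ { print name }
        { name = "" }
    '
}

# Example:
#   objdump -h eatsyscall.o | code_sections
```

If the command prints nothing for your object file, no section is flagged as executable code, which would match the gdb message in the question.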
I think I am having the same problem as you. Almost literally. I'm working through the same example in the Duntemann book. (The only difference in the source code is that I changed part of the string from Joe to Bob whilst trying to confirm that differences I made in the source code had the expected effect on the compiled executable.)
What I've discovered is slightly curious. I have two computers and am working alternately on each one using a synched dropbox directory. The older machine is running Ubuntu Karmic, because that has the best support for much of the stuff in Duntemann's book. I tried to put Karmic on the new machine as well, but some stuff simply won't install on it because it's no longer supported. So I'm running Ubuntu Oneiric on it instead.
Here's the thing. I can compile and run the exes on both machines. But anything compiled on the Oneiric machine seems to lack symbol information that gdb/kdbg/Insight is happy to work with. The stuff compiled on the Karmic machine works fine. And once built on the Karmic machine and synched with Dropbox, gdb/kdbg/Insight will run THAT exe fine on the Oneiric machine.
So the problem appears to be the compilation process on Oneiric. It is missing or changing something that messes up the ability of the debuggers to work with it properly.
Here is a dump of the Karmic object file:
$ cat karmic.txt
eatsyscall.o: file format elf32-i386
Contents of section .data:
0000 45617420 61742042 6f622773 210a Eat at Bob's!.
Contents of section .text:
0000 90b80400 0000bb01 000000b9 00000000 ................
0010 ba0e0000 00cd80b8 01000000 bb000000 ................
0020 00cd80 ...
Contents of section .comment:
0000 00546865 204e6574 77696465 20417373 .The Netwide Ass
0010 656d626c 65722032 2e30352e 303100 embler 2.05.01.
Contents of section .stab:
0000 01000000 00000a00 02000000 01000000 ................
0010 64000000 00000000 00000000 44001a00 d...........D...
0020 00000000 00000000 44001b00 01000000 ........D.......
0030 00000000 44001c00 06000000 00000000 ....D...........
0040 44001d00 0b000000 00000000 44001e00 D...........D...
0050 10000000 00000000 44001f00 15000000 ........D.......
0060 00000000 44002100 17000000 00000000 ....D.!.........
0070 44002200 1c000000 00000000 44002300 D.".........D.#.
0080 21000000 !...
Contents of section .stabstr:
0000 00656174 73797363 616c6c2e 61736d00 .eatsyscall.asm.
Here is a dump of the Oneiric object file:
$ cat oneiric.txt
eatsyscall.o: file format elf32-i386
Contents of section .data:
0000 45617420 61742042 6f622773 210a Eat at Bob's!.
Contents of section .text:
0000 90b80400 0000bb01 000000b9 00000000 ................
0010 ba0e0000 00cd80b8 01000000 bb000000 ................
0020 00cd80 ...
Contents of section .stab:
0000 01000000 00000b00 02000000 01000000 ................
0010 64000000 00000000 00000000 44001a00 d...........D...
0020 00000000 00000000 44001b00 01000000 ........D.......
0030 00000000 44001c00 06000000 00000000 ....D...........
0040 44001d00 0b000000 00000000 44001e00 D...........D...
0050 10000000 00000000 44001f00 15000000 ........D.......
0060 00000000 44002100 17000000 00000000 ....D.!.........
0070 44002200 1c000000 00000000 44002300 D.".........D.#.
0080 21000000 00000000 64000000 00000000 !.......d.......
Contents of section .stabstr:
0000 00656174 73797363 616c6c2e 61736d00 .eatsyscall.asm.
You can see that the two files are different (there are several extra bytes at the end of the non-working Oneiric file). And whatever nasm is doing on Oneiric is not playing properly with the debugger, it would seem.
Seems to work fine with current CVS version of GDB: "GNU gdb (GDB) 7.3.50.20111108-cvs", and also with GDB 7.2.
It sounds like "Ubuntu/Linaro 7.3-0ubuntu2" is broken in some way.
Try using section/segment .text instead of .txt or .code :)