gdb can't step, but can break on lines nasm - gdb

I have a file compiled with nasm, with nasm -f elf64 -g helloworld.asm, and here is the output of objdump -g -d -M intel helloworld.o:
helloworld.e: file format elf64-x86-64
Disassembly of section .text:
00000000004000b0 <_start>:
4000b0: b8 04 00 00 00 mov eax,0x4
4000b5: bb 01 00 00 00 mov ebx,0x1
4000ba: 48 b9 d8 00 60 00 00 movabs rcx,0x6000d8
4000c1: 00 00 00
4000c4: ba 0b 00 00 00 mov edx,0xb
4000c9: cd 80 int 0x80
4000cb: b8 01 00 00 00 mov eax,0x1
4000d0: bb 00 00 00 00 mov ebx,0x0
4000d5: cd 80 int 0x80
helloworld.asm:
/* file helloworld.asm line 9 addr 0x4000b0 */
/* file helloworld.asm line 10 addr 0x4000b5 */
/* file helloworld.asm line 11 addr 0x4000ba */
/* file helloworld.asm line 12 addr 0x4000c4 */
/* file helloworld.asm line 13 addr 0x4000c9 */
/* file helloworld.asm line 14 addr 0x4000cb */
/* file helloworld.asm line 15 addr 0x4000d0 */
/* file helloworld.asm line 16 addr 0x4000d5 */
So, at least to me, it looks like it has debug information. When I run gdb, I can set a breakpoint at any line, and it breaks at the proper memory address (and every register is updated as expected), yet I cannot step, since I get the horrible
Single stepping until exit from function _start,
which has no line number information.
My gdb version is 7.7.1 and nasm version is 2.10.9.
Does anyone have any idea?

In your case, the problem seems to be version skew between nasm and gdb.
Try updating gdb to at least version 7.8.

How to read sanitizer errors correctly?

I wrote a small client-server application in C++ (although much of it is C style). I build with ASan on macOS, where it doesn't report any errors; however, when I run the same test in Docker on Ubuntu, I get a message from the sanitizer.
I would like to fix the errors, but I don't understand what could be causing them, and I don't know how to find where the error is from the byte address.
# ./single_client.sh
clang -shared -fPIC -ldl -O3 -o monkey.so monkey.c
clang++ single_client.cpp -std=c++17 -g -O3 -Werror -Wall -Wextra -pthread -pedantic -o single_client
clang++ multiple_client.cpp -std=c++17 -g -O3 -Werror -Wall -Wextra -pthread -pedantic -o multiple_client
clang++ random_clients.cpp -std=c++17 -g -O3 -Werror -Wall -Wextra -pthread -pedantic -o random_clients
clang++ simple_server.cpp -std=c++17 -g -O3 -Werror -Wall -Wextra -pthread -pedantic -o simple_server
clang++ server.cpp -std=c++17 -g -O3 -g -fsanitize=address -Werror -Wall -Wextra -pthread -pedantic -o server
clang++ client.cpp -std=c++17 -g -O3 -g -fsanitize=address -Werror -Wall -Wextra -pthread -pedantic -o client
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
[TEST] Send
[TEST] Human readeable: 4 alex
[TEST] Send binary hex:
00000004 616c6578
[TEST] Send
[TEST] Human readeable: 11 hello world
[TEST] Send binary hex:
0000000b 68656c6c 6f20776f 726c64
[TEST] Read
=================================================================
==61==ERROR: AddressSanitizer: unknown-crash on address 0xffffd05639c0 at pc 0x00000050a6a4 bp 0xffffabcfe8f0 sp 0xffffabcfe908
READ of size 8 at 0xffffd05639c0 thread T1
#0 0x50a6a3 (/mess/my_data/artifacts/server+0x50a6a3)
#1 0xffffaf87d087 (/lib/aarch64-linux-gnu/libpthread.so.0+0x7087)
Address 0xffffd05639c0 is located in stack of thread T0 at offset 192 in frame
#0 0x50a07b (/mess/my_data/artifacts/server+0x50a07b)
This frame has 6 object(s):
[32, 36) 'clilen' (line 24)
[48, 64) 'serv_addr' (line 25)
[80, 96) 'cli_addr' (line 25)
[112, 160) 'm1' (line 27)
[192, 200) 'newsockfd' (line 59) <== Memory access at offset 192 is inside this variable
[224, 232) 'thread' (line 61)
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: unknown-crash (/mess/my_data/artifacts/server+0x50a6a3)
Shadow bytes around the buggy address:
0x200ffa0ac6e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x200ffa0ac6f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x200ffa0ac700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x200ffa0ac710: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x200ffa0ac720: f1 f1 f1 f1 04 f2 00 00 f2 f2 00 00 f2 f2 00 00
=>0x200ffa0ac730: 00 00 00 00 f2 f2 f2 f2[00]f2 f2 f2 f8 f3 f3 f3
0x200ffa0ac740: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x200ffa0ac750: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x200ffa0ac760: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x200ffa0ac770: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x200ffa0ac780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Thread T1 created by T0 here:
#0 0x4441ab (/mess/my_data/artifacts/server+0x4441ab)
#1 0x50a33f (/mess/my_data/artifacts/server+0x50a33f)
#2 0xffffaf6ed79f (/lib/aarch64-linux-gnu/libc.so.6+0x2079f)
#3 0x41f697 (/mess/my_data/artifacts/server+0x41f697)
==61==ABORTING
terminate called after throwing an instance of 'std::runtime_error'
what(): could not read message from server: Connection reset by peer
single_client_impl.sh: line 23: 63 Aborted ./simple-messanger-tests/src/single_client 8081
single_client_impl.sh: line 1: kill: (61) - No such process
P.S.
I thought the sanitizer was pointing me to line 59 of the code, but I didn't find anything unusual there:
size_t newsockfd = accept(sockfd, (struct sockaddr *)&cli_addr, &clilen);

Is there a way to see what's inside a ".rodata+(memory location)" in an object file?

So I'm taking a class where I am given a single object file and need to reverse engineer it into C++ code. The command I'm told to use is "gdb assignment6_1.o" to open it in gdb, and "disass main" to see the assembly code.
I'm also using "objdump -dr assignment6_1.o" myself, since it outputs a little more information.
The problem I'm running into is that, using objdump, I can see that the program is trying to access what I believe is a variable or maybe a string, ".rodata+0x41". There are multiple .rodata references; that's just one example.
Is there a command or somewhere I can look to see what that's referencing? I also have access to the "Bless" program.
Below is a snippet of the disassembled code I have.
a3: 48 8d 35 00 00 00 00 lea 0x0(%rip),%rsi # aa <main+0x31>
a6: R_X86_64_PC32 .rodata+0x41
aa: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # b1 <main+0x38>
ad: R_X86_64_PC32 _ZSt4cout-0x4
b1: e8 00 00 00 00 callq b6 <main+0x3d>
b2: R_X86_64_PLT32 _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc-0x4
b6: 48 8d 35 00 00 00 00 lea 0x0(%rip),%rsi # bd <main+0x44>
b9: R_X86_64_PC32 .rodata+0x53
bd: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # c4 <main+0x4b>
c0: R_X86_64_PC32 _ZSt4cout-0x4
c4: e8 00 00 00 00 callq c9 <main+0x50>
c5: R_X86_64_PLT32 _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc-0x4
c9: 48 8d 35 00 00 00 00 lea 0x0(%rip),%rsi # d0 <main+0x57>
cc: R_X86_64_PC32 .rodata+0x5e
d0: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # d7 <main+0x5e>
d3: R_X86_64_PC32 _ZSt4cout-0x4
d7: e8 00 00 00 00 callq dc <main+0x63>
d8: R_X86_64_PLT32 _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc-0x4
dc: 48 8d 35 00 00 00 00 lea 0x0(%rip),%rsi # e3 <main+0x6a>
df: R_X86_64_PC32 .rodata+0x6e
e3: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # ea <main+0x71>
e6: R_X86_64_PC32 _ZSt4cout-0x4
ea: e8 00 00 00 00 callq ef <main+0x76>
eb: R_X86_64_PLT32 _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc-0x4
Is there a way to see what's inside a ".rodata+(memory location)" in an object file?
Sure. Both objdump and readelf can dump contents of any section.
Example:
// x.c
#include <stdio.h>
int foo() { return printf("AA.\n") + printf("BBBB.\n"); }
gcc -c x.c
objdump -dr x.o
...
9: 48 8d 05 00 00 00 00 lea 0x0(%rip),%rax # 10 <foo+0x10>
c: R_X86_64_PC32 .rodata-0x4
...
1f: 48 8d 05 00 00 00 00 lea 0x0(%rip),%rax # 26 <foo+0x26>
22: R_X86_64_PC32 .rodata+0x1
...
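Note that a RIP-relative displacement is measured from the address of the next instruction, while the relocation is applied to the 4-byte displacement field that ends the instruction, so the addend carries a -4 bias. The actual data we care about is therefore at .rodata+0 and .rodata+5 (in your original disassembly, you care about .rodata+0x45, not .rodata+0x41).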
So what's there?
objdump -sj.rodata x.o
x.o: file format elf64-x86-64
Contents of section .rodata:
0000 41412e0a 00424242 422e0a00 AA...BBBB...
or, using readelf:
readelf -x .rodata x.o
Hex dump of section '.rodata':
0x00000000 41412e0a 00424242 422e0a00 AA...BBBB...
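The two steps above (undo the -4 bias in the addend, then read the bytes at that offset) can be sketched in C++ as follows. This is only an illustration: the function names `pc32_target_offset` and `string_at` are made up for this example, and it assumes the 4-byte displacement is the last field of the instruction, as it is in the `lea` listings shown here.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Offset into the section that a RIP-relative lea actually addresses, given
// the addend objdump prints for an R_X86_64_PC32 relocation.
// Assumption: the 4-byte displacement ends the instruction, so the bias is 4.
inline std::int64_t pc32_target_offset(std::int64_t addend) {
    return addend + 4;
}

// Read the NUL-terminated string starting at `offset` in a raw section dump
// (for example the bytes printed by objdump -sj.rodata).
inline std::string string_at(const std::vector<unsigned char>& section,
                             std::size_t offset) {
    std::string out;
    for (std::size_t i = offset; i < section.size() && section[i] != 0; ++i) {
        out.push_back(static_cast<char>(section[i]));
    }
    return out;
}
```

With the 12 bytes of .rodata dumped above, `string_at(rodata, pc32_target_offset(-4))` yields "AA.\n" and `string_at(rodata, pc32_target_offset(1))` yields "BBBB.\n".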

Why does a C++ inline function have call instructions?

I read that with inline functions, wherever a function call is made, the call is replaced with the body of the function definition.
According to that explanation, there should not be any function call when inline is used.
If that is the case, why do I see three call instructions in the assembly code?
#include <iostream>
inline int add(int x, int y)
{
return x+ y;
}
int main()
{
add(8,9);
add(20,10);
add(100,233);
}
meow@vikkyhacks ~/Arena/c/temp $ g++ -c a.cpp
meow@vikkyhacks ~/Arena/c/temp $ objdump -M intel -d a.o
0000000000000000 <main>:
0: 55 push rbp
1: 48 89 e5 mov rbp,rsp
4: be 09 00 00 00 mov esi,0x9
9: bf 08 00 00 00 mov edi,0x8
e: e8 00 00 00 00 call 13 <main+0x13>
13: be 0a 00 00 00 mov esi,0xa
18: bf 14 00 00 00 mov edi,0x14
1d: e8 00 00 00 00 call 22 <main+0x22>
22: be e9 00 00 00 mov esi,0xe9
27: bf 64 00 00 00 mov edi,0x64
2c: e8 00 00 00 00 call 31 <main+0x31>
31: b8 00 00 00 00 mov eax,0x0
36: 5d pop rbp
37: c3 ret
NOTE
Complete dump of the object file is here
You did not enable optimization, so the calls are not inlined.
You produced an object file (not an executable), so the calls are not resolved. What you see is a placeholder call whose target address will be filled in by the linker.
If you link a full executable, you will see the correct addresses for the calls.
See page 28 of:
http://www.cs.princeton.edu/courses/archive/spr04/cos217/lectures/Assembler.pdf
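As a rough sketch of the first point: with GCC or Clang, building with optimization (for example `g++ -O2 -c a.cpp`) normally lets the compiler inline such a small function. Alternatively, the compiler-specific `always_inline` attribute asks for inlining even without optimization. The attribute is a GCC/Clang extension, not standard C++, and `sum_of_adds` is a hypothetical helper added here only so that the results are actually used.

```cpp
// GCC/Clang extension (an assumption of this sketch, not standard C++):
// requests that every call to add() be expanded in place, even at -O0.
__attribute__((always_inline)) inline int add(int x, int y) {
    return x + y;
}

// Hypothetical caller that uses the results, so the optimizer cannot simply
// discard the three calls as dead code.
inline int sum_of_adds() {
    return add(8, 9) + add(20, 10) + add(100, 233);
}
```

Disassembling an object file built from code like this should show no `call` to `add`, though the exact output depends on the compiler and its version.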

TAR file format issue

It is unclear to me what the correct .tar file format is, as all three scenarios below work properly for me.
Based on the .tar specification I have been working with, the magic field (ustar) is a null-terminated character string and the version field is an octal number with no trailing nulls.
However, I've reviewed several .tar files I found on my server, and I found different implementations of the magic and version fields. All three of them seem to work properly, probably because the system ignores those fields.
Note the three bytes that differ between the words ustar and root in the following examples:
Scenario 1 (20 20 00):
000000F0 00 00 00 00 | 00 00 00 00 | 00 00 00 00 ............
000000FC 00 00 00 00 | 00 75 73 74 | 61 72 20 20 .....ustar
00000108 00 72 6F 6F | 74 00 00 00 | 00 00 00 00 .root.......
00000114 00 00 00 00 | 00 00 00 00 | 00 00 00 00 ............
Scenario 2 (00 20 20):
000000F0 00 00 00 00 | 00 00 00 00 | 00 00 00 00 ............
000000FC 00 00 00 00 | 00 75 73 74 | 61 72 00 20 .....ustar.
00000108 20 72 6F 6F | 74 00 00 00 | 00 00 00 00 root.......
00000114 00 00 00 00 | 00 00 00 00 | 00 00 00 00 ............
Scenario 3 (00 00 00):
000000F0 00 00 00 00 | 00 00 00 00 | 00 00 00 00 ............
000000FC 00 00 00 00 | 00 75 73 74 | 61 72 00 00 .....ustar..
00000108 00 72 6F 6F | 74 00 00 00 | 00 00 00 00 .root.......
00000114 00 00 00 00 | 00 00 00 00 | 00 00 00 00 ............
Which one is the correct format?
In my opinion none of your examples is the correct one, at least not for the POSIX format.
As you can read here:
/* tar Header Block, from POSIX 1003.1-1990. */
/* POSIX header */
struct posix_header { /* byte offset */
char name[100]; /* 0 */
char mode[8]; /* 100 */
char uid[8]; /* 108 */
char gid[8]; /* 116 */
char size[12]; /* 124 */
char mtime[12]; /* 136 */
char chksum[8]; /* 148 */
char typeflag; /* 156 */
char linkname[100]; /* 157 */
char magic[6]; /* 257 */
char version[2]; /* 263 */
char uname[32]; /* 265 */
char gname[32]; /* 297 */
char devmajor[8]; /* 329 */
char devminor[8]; /* 337 */
char prefix[155]; /* 345 */
};
#define TMAGIC "ustar" /* ustar and a null */
#define TMAGLEN 6
#define TVERSION "00" /* 00 and no null */
#define TVERSLEN 2
The format of your first example (Scenario 1) seems to be matching with the old GNU header format:
/* OLDGNU_MAGIC uses both magic and version fields, which are contiguous.
Found in an archive, it indicates an old GNU header format, which will be
hopefully become obsolescent. With OLDGNU_MAGIC, uname and gname are
valid, though the header is not truly POSIX conforming */
#define OLDGNU_MAGIC "ustar  " /* 7 chars and a null */
In both your second and third examples (Scenario 2 and Scenario 3), the version field is set to an unexpected value (according to the above documentation, the correct value should be 00 ASCII or 0x30 0x30 hex), so this field is most likely ignored.
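The distinction can be sketched as a small check on the 8 bytes at offset 257 (magic followed by version). This is a minimal illustration based only on the header layout and macros quoted above; the function name `classify_tar_magic` and the returned labels are invented for this example and are not used by any tar implementation.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Offset of the magic field in the posix_header layout quoted above.
constexpr std::size_t kMagicOffset = 257;

// Classify a tar header block by its magic (6 bytes) + version (2 bytes).
// `header` must point to at least 265 readable bytes.
// The labels returned here are illustrative only.
inline std::string classify_tar_magic(const unsigned char* header) {
    const unsigned char* p = header + kMagicOffset;
    // POSIX: "ustar" + NUL, then version "00" with no NUL.
    if (std::memcmp(p, "ustar\0" "00", 8) == 0) return "posix";
    // Old GNU: "ustar" + two spaces + NUL spanning both fields.
    if (std::memcmp(p, "ustar\x20\x20\0", 8) == 0) return "oldgnu";
    // "ustar" is present but the remaining three bytes match neither form.
    if (std::memcmp(p, "ustar", 5) == 0) return "ustar-nonstandard";
    return "unknown";
}
```

Under this sketch, Scenario 1 is classified as "oldgnu", while Scenarios 2 and 3 both fall into "ustar-nonstandard".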
With Fedora 18 if I execute this command:
tar --format=posix -cvf testPOSIX.tar test.txt
I have a POSIX tar file format with: ustar\0 (0x757374617200)
else if I execute this:
tar --format=gnu -cvf testGNU.tar test.txt
I have a GNU tar file format with: ustar 0x20 0x20 0x00 (0x7573746172202000) (old gnu format)
From /usr/share/magic file:
# POSIX tar archives
257 string ustar\0 POSIX tar archive
!:mime application/x-tar # encoding: posix
257 string ustar\040\040\0 GNU tar archive
!:mime application/x-tar # encoding: gnu
0x20 is 40 in octal.
I also tried editing those bytes in a hex editor to:
00 20 20
and the tar file still worked correctly; I extracted test.txt without problems.
But when I edited the bytes to:
00 00 00
the tar file was not recognized.
So, my conclusion is that the correct format is:
20 20 00

Accessing specific binary information based on binary format documentation

I have a binary file and documentation of the format the information is stored in. I'm trying to write a simple program using c++ that pulls a specific piece of information from the file but I'm missing something since the output isn't what I expect.
The documentation is as follows:
Half-word  Field Name     Type   Units    Range       Precision
10         Block Divider  INT*2  N/A      -1          N/A
11-12      Latitude       INT*4  Degrees  -90 to +90  0.001
There are other items in the file obviously but for this case I'm just trying to get the Latitude value.
My code is:
#include <cstdlib>
#include <iostream>
#include <fstream>
using namespace std;
int main(int argc, char* argv[])
{
    const char* dataFileLocation = "testfile.bin";
    ifstream dataFile(dataFileLocation, ios::in | ios::binary);
    if(dataFile.is_open())
    {
        char* buffer = new char[32768];
        dataFile.seekg(10, ios::beg);
        dataFile.read(buffer, 4);
        dataFile.close();
        cout << "value is " << (int)(buffer[0] & 255);
    }
}
The result of which is "value is 226" which is not in the allowed range.
I'm quite new to this, and here's what my intentions were when writing the above code:
Open file in binary mode
Seek to the 11th byte from the start of the file
Read in 4 bytes from that point
Close the file
Output those 4 bytes as an integer.
If someone could point out where I'm going wrong I'd sure appreciate it. I don't really understand the (buffer[0] & 255) part (took that from some example code) so layman's terms for that would be greatly appreciated.
Hex Dump of the first 100 bytes:
testfile.bin 98,402 bytes 11/16/2011 9:01:52
-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
00000000- 00 5F 3B BF 00 00 C4 17 00 00 00 E2 2E E0 00 00 [._;.............]
00000001- 00 03 FF FF 00 00 94 70 FF FE 81 30 00 00 00 5F [.......p...0..._]
00000002- 00 02 00 00 00 00 00 00 3B BF 00 00 C4 17 3B BF [........;.....;.]
00000003- 00 00 C4 17 00 00 00 00 00 00 00 00 80 02 00 00 [................]
00000004- 00 05 00 0A 00 0F 00 14 00 19 00 1E 00 23 00 28 [.............#.(]
00000005- 00 2D 00 32 00 37 00 3C 00 41 00 46 00 00 00 00 [.-.2.7.<.A.F....]
00000006- 00 00 00 00 [.... ]
Since the documentation lists the field as an integer but shows the precision to be 0.001, I would assume that the actual value is the stored value multiplied by 0.001. The integer range would be -90000 to 90000.
The 4 bytes must be combined into a single integer. There are two ways to do this, big endian and little endian, and which you need depends on the machine that wrote the file. x86 PCs for example are little endian.
int little_endian = buffer[0] | buffer[1]<<8 | buffer[2]<<16 | buffer[3]<<24;
int big_endian = buffer[0]<<24 | buffer[1]<<16 | buffer[2]<<8 | buffer[3];
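The & 255 keeps only the low 8 bits of the value, which removes the sign extension that occurs when a signed char with its high bit set is converted to a signed integer; without it, a byte of 0xE2 would print as -30 instead of 226. Use unsigned char for the buffer instead and you won't need the mask; with a plain char buffer, mask each byte this way before shifting it in the expressions above.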
Edit: I think "half-word" refers to 2 bytes, so you'll need to skip 20 bytes instead of 10.
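Putting these points together, a minimal sketch might look like the code below. It assumes half-words are 2 bytes and numbered from 1, and that the file is big endian; both are guesses, but they fit the hex dump: bytes 18-19 are FF FF (the block divider, -1), and bytes 20-23 are 00 00 94 70. The helper names `halfword_offset`, `read_be32` and `latitude_degrees` are made up for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Byte offset of a 1-based half-word index, assuming 2-byte half-words.
inline std::size_t halfword_offset(std::size_t halfword_index) {
    return (halfword_index - 1) * 2;
}

// Combine 4 bytes, most significant first, into a signed 32-bit value.
// Working on unsigned char avoids the sign-extension problem entirely.
inline std::int32_t read_be32(const unsigned char* p) {
    const std::uint32_t u = (static_cast<std::uint32_t>(p[0]) << 24) |
                            (static_cast<std::uint32_t>(p[1]) << 16) |
                            (static_cast<std::uint32_t>(p[2]) << 8) |
                            static_cast<std::uint32_t>(p[3]);
    // Reinterpret as two's complement (modular on mainstream compilers).
    return static_cast<std::int32_t>(u);
}

// Latitude in degrees: half-words 11-12, scaled by the documented 0.001.
// `file_bytes` must point to the start of the file contents (24+ bytes).
inline double latitude_degrees(const unsigned char* file_bytes) {
    return read_be32(file_bytes + halfword_offset(11)) * 0.001;
}
```

Applied to the dump in the question, `read_be32` at offset 20 gives 38000, i.e. a latitude of 38.000 degrees, which is inside the documented range.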