Segmentation fault when linking C++ and assembly

So I am trying to link a simple assembly script with c++, and so far no luck.
Assembly Script
section .data
global getebx
getebx:
mov eax, 0x0
cpuid
mov eax, ebx
ret
c++
#include <iostream>
extern "C" unsigned getebx();
int main (){
std::cout << (const char *)getebx()<< std::endl;
return 0;
}
To build, I am simply running the following commands:
nasm -f elf32 cpuidtest.asm
g++ -m32 -g main.cc cpuidtest.o
When I ran the executable I got a Segmentation Fault (Core dumped) error. So my next instinct was to take it to gdb. Here is what it returned:
program received signal SIGSEGV, Segmentation fault.
0xf7da0e86 in ?? () from /lib/i386-linux-gnu/libc.so.6
How can I fix this problem? Thank you in advance.

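The calling convention requires you to preserve some registers across a call. In 32-bit cdecl code (which g++ -m32 produces on Linux), ebx, esi, edi and ebp are callee-saved, and cpuid overwrites ebx. You should modify your code to save and restore it, such as: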
getebx:
push ebx
mov eax, 0x0
cpuid
mov eax, ebx
pop ebx
ret
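Also, putting code into the .data section isn't the best idea ;) Use section .text for code: data pages may be mapped non-executable, in which case the call itself would fault.
Furthermore, ebx does not hold a string (a pointer to char), so you cannot print it like that: the cast makes std::cout treat the register value as an address and read from it, which is the likely source of the crash inside libc. ebx holds 4 characters of the vendor string, so something like this works better: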
int main (){
unsigned ebx = getebx();
std::cout << std::string((char*)&ebx, 4) << std::endl;
return 0;
}
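For CPUID leaf 0, the full vendor string is spread over three registers, in the order EBX, EDX, ECX. The sketch below shows how the three values could be packed into one 12-character string. The helper name vendor_from_registers is hypothetical, and it assumes a little-endian host (true for x86), with the register values already fetched by an assembly routine like the one above.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: packs the three CPUID leaf-0 registers into the
// 12-character vendor string. Assumes a little-endian host, so the lowest
// byte of each register is the first character of its 4-byte group.
std::string vendor_from_registers(std::uint32_t ebx,
                                  std::uint32_t edx,
                                  std::uint32_t ecx) {
    char buf[12];
    std::memcpy(buf + 0, &ebx, 4);  // characters 0..3
    std::memcpy(buf + 4, &edx, 4);  // characters 4..7
    std::memcpy(buf + 8, &ecx, 4);  // characters 8..11
    return std::string(buf, sizeof buf);
}
```

Using memcpy instead of a pointer cast keeps the copy well-defined, and constructing the string with an explicit length avoids relying on a terminating zero that the registers do not contain.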

Related

How to translate assembly code typed by {} into ()

First of all... I am a total noob with assembly. I understand almost nothing. But this code which you are gonna see below works fine in Visual Studio. I just need to compile this to .o file using a simple g++ command.
g++ -o fileName.o filename.cpp
I need to translate assembly code written inside braces {} to assembly written inside parentheses (). When I try to compile the code below, it fails. The compiler suggests using ( instead of {
unsigned char decode5a[0x0dac];
unsigned char* srcbuf = new unsigned char[4000];
m_image = new unsigned char[4000];
unsigned char* dstbuf = m_image;
__asm
{
lea eax, decode5a
push srcbuf
push dstbuf
call eax
add esp, 8
}
I tried something like this, but it also fails. I think I am passing the variables incorrectly.
__asm__(
"lea eax, decode5a \n
push srcbuf \n
push dstbuf \n
call eax \n
add esp, 8 \n
");
Here's how you would write that in gcc extended inline assembly, but this still may not work depending on what the function does.
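In particular, any registers modified by the function have to be listed in the clobbers; for a cdecl function that means at least eax, ecx and edx, which the callee is free to overwrite.
__asm__(
"push %1\n"
"push %2\n"
"call *%0\n"
"add $8, %%esp \n"
: : "r"(decode5a), "r"(srcbuf), "r"(dstbuf)
: "eax", "ecx", "edx", "cc", "memory"); // a cdecl callee may overwrite eax, ecx, edx and the flags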
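If the routine really follows the cdecl convention (an assumption, based on the two pushes and the add esp, 8 cleanup), inline assembly may not be needed at all: the same call can be expressed through a typed function pointer and the compiler emits the pushes, the call and the cleanup itself. The sketch below illustrates this with a hypothetical stand-in function, copy_four_bytes, in place of the real routine; DecodeFn and call_decoder are also made-up names.

```cpp
#include <cstring>

// Signature assumed from the pushes in the question: cdecl pushes arguments
// right to left, so dstbuf (pushed last) is the first argument.
using DecodeFn = void (*)(unsigned char* dst, unsigned char* src);

// Hypothetical stand-in for the real routine, so the sketch can run.
void copy_four_bytes(unsigned char* dst, unsigned char* src) {
    std::memcpy(dst, src, 4);
}

// Calls the routine through a typed pointer instead of hand-written asm.
void call_decoder(DecodeFn fn, unsigned char* dstbuf, unsigned char* srcbuf) {
    fn(dstbuf, srcbuf);
}
```

For a routine stored in a byte buffer such as decode5a, the pointer would have to come from something like reinterpret_cast<DecodeFn>(decode5a), which is only conditionally supported by the language, and the buffer would need to live in executable memory.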

Extern C++ compiled fn in asm

I'm following an OS dev series by poncho on yt.
The 6th video linked C++ with assembly code using extern but the code was linked as C code as it was extern "C" void _start().
In ExtendedProgram.asm, _start was called like:
[extern _start]
Start64bit:
mov edi, 0xb8000
mov rax, 0x1f201f201f201f20
mov ecx, 500
rep stosq
call _start
jmp $
The Kernel.cpp had:
extern "C" void _start() {
return;
}
One of the comments in the video shows that for C++ a different name, _Z6_startv, is created.
So to try out I modified my Kernel.cpp as:
extern void _Z6_startv() { return; }
And also modified the ExtendedProgram.asm by replacing _start with _Z6_startv but the linker complained,
/usr/local/x86_64elfgcc/bin/x86_64-elf-ld: warning: cannot find entry symbol _start; defaulting to 0000000000008000
then I tried,
Kernel.cpp
extern "C++" void _Z6_startv() { return; } // I didn't even know wut i was doin'
And linker complained again.
I did try some other combinations & methods, all ending miserably, eventually landing here on Stack Overflow.
So, the question:
How to compile the function as a C++ function and link it to assembly?
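There is some confusion between symbol names here.
A C++ function declared as void _start() is mangled to _Z6_startv at compile time, so _Z6_startv is the symbol that the linker (and your asm code) sees. Mangling is what C++ compilers normally do, but extern "C" tells the compiler to give the function C linkage, where no mangling happens, so _start stays _start. That means you do not need to change anything in the code you initially showed. Note that _Z6_startv is only the mangled form of _start: if you name the function _Z6_startv in the C++ source, that name is mangled again and the assembly reference no longer matches.
If you want to remove the extern "C", keep the function named _start in Kernel.cpp and refer to it by its mangled name in the assembly. The linker may still warn that it cannot find the entry symbol _start, because no symbol with that exact name exists any more: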
[extern _Z6_startv]
Start64bit:
mov edi, 0xb8000
mov rax, 0x1f201f201f201f20
mov ecx, 500
rep stosq
call _Z6_startv
jmp $
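To check which source-level name a mangled symbol stands for, GCC and Clang expose the demangler through abi::__cxa_demangle in <cxxabi.h>. The following is a minimal sketch; the wrapper name demangle is hypothetical, and it assumes a toolchain that uses the Itanium C++ ABI (as the x86_64-elf cross compiler in the question does).

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC and Clang)
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the demangled form of a symbol name,
// or the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }
    std::string result(text);
    std::free(text);  // the returned buffer is allocated with malloc
    return result;
}
```

With this, demangle("_Z6_startv") yields _start(), which confirms that the mangled symbol belongs to a C++ function whose source name is _start and which takes no arguments.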

Calling a standard-library-function in MASM

I want to get started in MASM in a mixed C++/Assembly way.
I am currently trying to call a standard-library-function (e.g. printf) from a PROC in assembly, that I then call in C++.
I have the code working after I declared printf's signature in my cpp-file. But I do not understand why I have to do this and if I can avoid that.
My cpp-file:
#include <stdio.h>
extern "C" {
extern int __stdcall foo(int, int);
}
extern int __stdcall printf(const char*, ...); // When I remove this line I get Linker-Error "LNK2019: unresolved external symbol"
int main()
{
foo(5, 5);
}
My asm-file:
.model flat, stdcall
EXTERN printf :PROC ; declare printf
.data
tstStr db "Mult: %i",0Ah,"Add: %i",0 ; 0Ah is a line feed (newline) - escape sequences are not supported
.code
foo PROC x:DWORD, y:DWORD
mov eax, x
mov ebx, y
add eax, ebx
push eax
mov eax, x
mul ebx
push eax
push OFFSET tstStr
call printf
ret
foo ENDP
END
Some Updates
In response to the comments, I tried to rework the code to use the cdecl calling convention. Unfortunately, this did not solve the problem (the code runs fine with the extern declaration, but fails to link without it).
But by trial and error I found out that the extern seems to force external linkage, even though the keyword should not be needed, because external linkage should be the default for function declarations.
I can omit the declaration by using the function in my cpp code (i.e. if I add a printf("\0"); somewhere in the source file, the linker is fine with it and everything works correctly).
The new (but not really better) cpp-file:
#include <stdio.h>
extern "C" {
extern int __cdecl foo(int, int);
}
extern int __cdecl printf(const char*, ...); // omitting the extern results in a linker error
int main()
{
//printf("\0"); // this would replace the declaration
foo(5, 5);
return 0;
}
The asm-file:
.model flat, c
EXTERN printf :PROC
.data
tstStr db "Mult: %i",0Ah,"Add: %i",0Ah,0 ; 0Ah is a line feed (newline) - escape sequences are not supported
.code
foo PROC
push ebp
mov ebp, esp
mov eax, [ebp+8]
mov ebx, [ebp+12]
add eax, ebx
push eax
mov eax, [ebp+8]
mul ebx
push eax
push OFFSET tstStr
call printf
add esp, 12
pop ebp
ret
foo ENDP
END
My best guess is that this has to do with the fact that Microsoft refactored the C library starting with VS 2015 and some of the C library is now inlined (including printf) and isn't actually in the default .lib files.
My guess is in this declaration:
extern int __cdecl printf(const char*, ...);
extern forces the old legacy libraries to be included in the link process. Those libraries contain the non-inlined function printf. If the C++ code doesn't force the MS linker to include the legacy C library then the MASM code's use of printf will become unresolved.
I believe this is related to this Stackoverflow question and my answer in 2015. If you want to remove extern int __cdecl printf(const char*, ...); from the C++ code you may wish to consider adding this line to your MASM code:
includelib legacy_stdio_definitions.lib
Your MASM code would look like this if you are using CDECL calling convention and mixing C/C++ with assembly:
.model flat, C ; Default to C language
includelib legacy_stdio_definitions.lib
EXTERN printf :PROC ; declare printf
.data
tstStr db "Mult: %i",0Ah,"Add: %i",0 ; 0Ah is a line feed (newline) - escape sequences are not supported
.code
foo PROC x:DWORD, y:DWORD
mov eax, x
mov ebx, y
add eax, ebx
push eax
mov eax, x
mul ebx
push eax
push OFFSET tstStr
call printf
ret
foo ENDP
END
Your C++ code would be:
#include <stdio.h>
extern "C" {
extern int foo(int, int); /* __cdecl removed since it is the default */
}
int main()
{
//printf("\0"); // this would replace the declaration
foo(5, 5);
return 0;
}
The alternative to passing the includelib line in the assembly code is to add legacy_stdio_definitions.lib to the dependency list in the linker options of your Visual Studio project or the command line options if you invoke the linker manually.
Calling Convention Bug in your MASM Code
You can read about the CDECL calling convention for 32-bit Windows code in the Microsoft documentation as well as this Wiki article. Microsoft summarizes the CDECL calling convention as:
On x86 platforms, all arguments are widened to 32 bits when they are passed. Return values are also widened to 32 bits and returned in the EAX register, except for 8-byte structures, which are returned in the EDX:EAX register pair. Larger structures are returned in the EAX register as pointers to hidden return structures. Parameters are pushed onto the stack from right to left. Structures that are not PODs will not be returned in registers.
The compiler generates prologue and epilogue code to save and restore the ESI, EDI, EBX, and EBP registers, if they are used in the function.
The last paragraph is important in relation to your code. The ESI, EDI, EBX, and EBP registers are non-volatile and must be saved and restored by the called function if they are modified. Your code clobbers EBX, so you must save and restore it. You can get MASM to do that by using the USES directive in a PROC statement:
foo PROC uses EBX x:DWORD, y:DWORD
mov eax, x
mov ebx, y
add eax, ebx
push eax
mov eax, x
mul ebx
push eax
push OFFSET tstStr
call printf
add esp, 12 ; Remove the parameters pushed on the stack for
; the printf call. The stack needs to be
; properly restored. If not done, the function
; epilogue can't properly restore EBX
; (and any registers listed by USES)
ret
foo ENDP
uses EBX tells MASM to generate extra prologue and epilogue code to save EBX at the start and restore it before the function executes its ret instruction. The generated instructions would look something like:
0000 _foo:
0000 55 push ebp
0001 8B EC mov ebp,esp
0003 53 push ebx
0004 8B 45 08 mov eax,0x8[ebp]
0007 8B 5D 0C mov ebx,0xc[ebp]
000A 03 C3 add eax,ebx
000C 50 push eax
000D 8B 45 08 mov eax,0x8[ebp]
0010 F7 E3 mul ebx
0012 50 push eax
0013 68 00 00 00 00 push tstStr
0018 E8 00 00 00 00 call _printf
001D 83 C4 0C add esp,0x0000000c
0020 5B pop ebx
0021 C9 leave
0022 C3 ret
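Because cdecl pushes arguments right to left, the sum that foo pushes first ends up as the last printf argument, and the product pushed after it becomes the first value printed. A plain C++ reference can help when checking the assembly's output. This is only a sketch: foo_reference is a hypothetical name, and it returns the formatted text instead of printing it so that it can be compared against what the assembly prints.

```cpp
#include <cstdio>
#include <string>

// Hypothetical reference for the assembly routine foo: formats the product
// and the sum with the same format string as tstStr (0Ah written as \n).
std::string foo_reference(int x, int y) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "Mult: %i\nAdd: %i", x * y, x + y);
    return buf;
}
```

For the call foo(5, 5) from the question, the reference produces the two lines Mult: 25 and Add: 10.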
That's indeed a bit pointless, isn't it?
Linkers are often pretty dumb things. They need to be told that an object file requires printf. Linkers can't figure that out from a missing printf symbol, stupidly enough.
The C++ compiler will tell the linker that it needs printf when you write extern int __stdcall printf(const char*, ...);. Or, and that's the normal way, the compiler will tell the linker so when you actually call printf. But your C++ code doesn't call it!
Assemblers are also pretty dumb. Your assembler clearly fails to tell the linker that it needs printf from C++.
The general solution is not to do complex things in assembly. That's just not what assembly is good for. Calls from C to assembly generally work well, calls the other way are problematic.

C++ inline assembly PROC ENDP error

I am trying to create procedure in assembly x86 inside a C++ program. My code is:
#include <stdio.h>
#include <stdlib.h>
int main(void){
_asm{
input1 PROC
push inputnumber
lea eax, inputmsg
push eax
call printf
add esp, 8
push ebx
lea eax, format
push eax
call scanf
add esp, 8
jmp check1
ret
input1 ENDP
}
}
However, when I try to compile the program with Visual studio I get the following error:
C2400 inline assembler syntax error in 'opcode'; found 'PROC'
C2400 inline assembler syntax error in 'opcode'; found 'ENDP'
I've read online but I cannot resolve it. Any suggestions how to fix it ?
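I'm surprised that those are the only errors you get. PROC and ENDP are not recognized by the C/C++ inline assembler. Anyway, defining a function inside a function in C isn't a good idea, so put the instructions directly in the enclosing function and leave out the ret: executing it inside main would return before the compiler-generated epilogue runs. Try
int main(){
_asm{
push inputnumber
lea eax, inputmsg
:
call scanf
add esp, 8
}
}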
You will then end up with a whole bunch of undeclared variables and possibly warnings about scanf if you're using one of the MS compilers.

Inline assembly troubles

I tried to compile with GCC inline assembly code which compiled fine with MSVC, but got the following errors for basic operations:
// var is a template variable in a C++ function
__asm__
{
mov edx, var //error: Register name not specified for %edx
push ebx //error: Register name not specified for %ebx
sub esp, 8 //error: Register name not specified for %esp
}
After looking through documentation covering the topic, I found out that I should probably convert (even if I am only interested in x86) Intel style assembly code to AT&T style. However, after trying to use AT&T style I got even more weird errors:
mov var, %edx //error: Expected primary-expression before % token
mov $var, edx //error: label 'LASM$$s' used but not defined
I should also note that I tried to use LLVM-GCC, but it failed miserably with internal errors after encountering inline assembly.
What should I do?
For Apple's gcc you want -fasm-blocks which allows you to omit gcc's quoting requirement for inline asm and also lets you use Intel syntax.
// test_asm.c
int main(void)
{
int var;
__asm__
{
mov edx,var
push ebx
sub esp,8
}
return 0;
}
Compile this with:
$ gcc -Wall -m32 -fasm-blocks test_asm.c -o test_asm
Tested with gcc 4.2.1 on OS X 10.6.
g++ inline assembler is much more flexible than MSVC, and much more complicated. It treats an asm directive as a pseudo-instruction, which has to be described in the language of the code generator. Here is a working sample from my own code (for MinGW, not Mac):
// int BNASM_Add (DWORD* result, DWORD* a, int len)
//
// result += a
int BNASM_Add (DWORD* result, DWORD* a, int len)
{
int carry ;
asm volatile (
".intel_syntax noprefix\n" // noprefix: register names are written without %
" clc\n"
" cld\n"
"loop03:\n"
" lodsd\n"
" adc [edx],eax\n"
" lea edx,[edx+4]\n" // add edx,4 without disturbing the carry flag
" loop loop03\n"
" adc ecx,0\n" // Return the carry flag (ecx known to be zero)
".att_syntax\n"
: "=c"(carry), "+d"(result), "+S"(a) // Outputs: carry in ecx; edx and esi are advanced by the loop
: "c"(len) // Input: len in ecx
: "eax", "memory", "cc" // Clobbers: lodsd writes eax, adc writes memory and the flags
) ;
return carry ;
}
You can find documentation at http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Extended-Asm.
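As a smaller illustration of the same extended asm form, the sketch below adds two integers in AT&T syntax. It shows the points the question ran into: the instructions go inside a string, and C++ variables are passed as numbered operands rather than named directly. The function name add_with_asm is hypothetical, and the asm path is only compiled for GCC-compatible compilers targeting x86; elsewhere it falls back to plain C++.

```cpp
// Hypothetical example: adds b to a using GCC extended inline assembly.
int add_with_asm(int a, int b) {
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    // %0 is a (read and written, hence "+r"), %1 is b (read only).
    // AT&T operand order is source first, destination second.
    __asm__("addl %1, %0"
            : "+r"(a)   // output operand, also used as input
            : "r"(b)    // input operand
            : "cc");    // the add instruction changes the flags
    return a;
#else
    return a + b;       // fallback for other compilers and architectures
#endif
}
```

Letting the compiler pick the registers through "r" constraints, instead of naming edx or ebx in the template, avoids having to save and restore specific registers by hand.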