Convert AT&T syntax to Intel Syntax (ASM) - c++

I've been trying to access the PEB information of an executable as seen here: Access x64 TEB C++ & Assembly
The code works only in AT&T syntax. When I try to use Intel syntax, it fails to give the same value, so there must be an error on my part.
How can I convert:
int main()
{
void* ptr = 0; //0x7fff5c4ff3c0
asm volatile
(
"movq %%gs:0x30, %%rax\n\t"
"movq 0x60(%%rax), %%rax\n\t"
"movq 0x18(%%rax), %%rax\n\t"
"movq %%rax, %0\n"
: "=r" (ptr) ::
);
}
to Intel Syntax?
I tried:
asm volatile
(
"movq rax, gs:[0x30]\n\t"
"movq rax, [rax + 0x60]\n\t"
"movq rax, [rax + 0x18]\n\t"
"movq rax, %0\n"
: "=r" (ptr) ::
);
and:
asm volatile
(
"mov rax, QWORD PTR gs:[0x30]\n\t"
"mov rax, QWORD PTR [rax + 0x60]\n\t"
"mov rax, QWORD PTR [rax + 0x18]\n\t"
"movq rax, %0\n" //mov rax, QWORD PTR [%0]\n
: "=r" (ptr) ::
);
They do not print the same value as the AT&T syntax: 0x7fff5c4ff3c0
Any ideas?

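You forgot to reverse the operand order on the last line: Intel syntax puts the destination first, so it should be mov %0, rax rather than movq rax, %0. That said, the only instruction that has to be in asm is the first one, because of the gs segment override; the rest could be done in C.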
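As a minimal sketch of both points, assuming GCC or Clang targeting x86-64: the {att|intel} dialect alternatives let one template assemble under either the default AT&T mode or -masm=intel, and they make the reversed operand order explicit. copy_via_mov and read_peb_field are hypothetical names; the Windows-only part reuses the offsets 0x30, 0x60 and 0x18 from the question and keeps only the gs access in asm.

```cpp
#include <cstdint>

// Copies `src` into the result with a register-to-register mov.
// Inside {att|intel}, the left side is used in the default AT&T mode and
// the right side when compiling with -masm=intel.
std::uint64_t copy_via_mov(std::uint64_t src) {
    std::uint64_t dst;
    asm("mov {%1, %0|%0, %1}"  // AT&T: mov src, dst | Intel: mov dst, src
        : "=r"(dst)
        : "r"(src));
    return dst;
}

#ifdef _WIN64
// Windows x64 only (assumption: same layout as in the question).
// Only the gs-relative load needs asm; the two dependent loads are plain C++.
void* read_peb_field() {
    char* teb;
    asm volatile("mov {%%gs:0x30, %0|%0, QWORD PTR gs:[0x30]}" : "=r"(teb));
    char* peb = *reinterpret_cast<char**>(teb + 0x60);
    return *reinterpret_cast<void**>(peb + 0x18);
}
#endif
```

Letting the compiler pick the register with "=r" also removes the need to name rax in the template or list it as a clobber.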

Related

Why is vzeroupper being inserted at the end of this code?

I noticed something strange when I compile this code on godbolt, with MSVC:
#include <intrin.h>
#include <cstdint>
void test(unsigned char*& pSrc) {
__m256i data = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(pSrc));
int32_t mask = _mm256_movemask_epi8(data);
if (!mask) {
++pSrc;
}
else {
unsigned long v;
_BitScanForward(&v, mask);
pSrc += v;
}
}
I get this resulting assembly:
pSrc$ = 8
void test(unsigned char * &) PROC ; test, COMDAT
mov rdx, QWORD PTR [rcx]
vmovdqu ymm0, YMMWORD PTR [rdx]
vpmovmskb eax, ymm0
test eax, eax
jne SHORT $LN2@test
mov eax, 1
add rax, rdx
mov QWORD PTR [rcx], rax
vzeroupper ; Why is this being inserted?
ret 0
$LN2@test:
bsf eax, eax
add rax, rdx
mov QWORD PTR [rcx], rax
vzeroupper ; Why is this being inserted?
ret 0
void test(unsigned char * &) ENDP ; test
Why is vzeroupper being inserted at the end of each scope? I heard that it's because of switching between SSE and AVX, but I'm not doing that here. I'm using exclusively AVX code.
I was wondering, does this pose a performance problem?

inline assembly block with multiple outputs [duplicate]

This question already has answers here:
How to invoke a system call via syscall or sysenter in inline assembly?
(2 answers)
Unexpected GCC inline ASM behaviour (clobbered variable overwritten)
(1 answer)
When to use earlyclobber constraint in extended GCC inline assembly?
(2 answers)
inline assembly constraint for value that might be overwritten
(1 answer)
Closed 1 year ago.
How does one specify multiple outputs with an inline asm statement using gcc? I don't follow how the garbage value for ret is printed, but I suspect it's possibly related to both syscall and the mov at the top of the inline assembly section both writing to an output register.
Source:
#include <string.h>
#include <iostream>
int main() {
const char* str = "Hello World\n";
long len = strlen(str);
long ret = 0;
long test = 0;
__asm__ __volatile__ (
"mov $22, %0\n\t"
"movq $1, %%rax \n\t"
"movq $1, %%rdi \n\t"
"movq %2, %%rsi \n\t"
"movl %3, %%edx \n\t"
"syscall"
: "=r"(test), "=g"(ret)
: "g"(str), "g" (len));
std::cout << ret << "\n";
return 0;
}
Output:
Hello World
4202512
Disassembly
Dump of assembler code for function main():
0x0000000000401080 <+0>: sub $0x8,%rsp
0x0000000000401084 <+4>: mov $0x16,%rax
0x000000000040108b <+11>: mov $0x1,%rax
0x0000000000401092 <+18>: mov $0x1,%rdi
0x0000000000401099 <+25>: mov $0x402010,%rsi
0x00000000004010a0 <+32>: mov $0xc,%edx
0x00000000004010a5 <+37>: syscall
0x00000000004010a7 <+39>: mov $0x404080,%edi
0x00000000004010ac <+44>: callq 0x401040 <_ZNSo9_M_insertIlEERSoT_@plt>
0x00000000004010b1 <+49>: mov $0x1,%edx
0x00000000004010b6 <+54>: mov $0x40201b,%esi
0x00000000004010bb <+59>: mov %rax,%rdi
0x00000000004010be <+62>: callq 0x401050 <_ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l@plt>
0x00000000004010c3 <+67>: xor %eax,%eax
0x00000000004010c5 <+69>: add $0x8,%rsp
0x00000000004010c9 <+73>: retq
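In the code above the template never stores anything into %1, so ret ends up holding whatever happened to be in the location the compiler picked for it, and rax, rdi, rsi and rdx are overwritten without the compiler being told. One way the constraints could be written, as a sketch assuming x86-64 Linux with GCC or Clang in the default AT&T dialect (WriteResult and write_hello are hypothetical names):

```cpp
#include <cstring>

struct WriteResult {
    long ret;   // value returned by the write syscall
    long test;  // second output, set to 22 by the template
};

WriteResult write_hello() {
    const char* str = "Hello World\n";
    long len = static_cast<long>(std::strlen(str));
    long ret, test;
    asm volatile(
        "mov $22, %1\n\t"  // write the second output before the syscall
        "syscall"
        : "=a"(ret),       // output 0: rax holds the syscall result
          "=&r"(test)      // output 1: early clobber, written before inputs are consumed
        : "0"(1L),         // rax = 1 (write), same register as output 0
          "D"(1L),         // rdi = 1 (stdout)
          "S"(str),        // rsi = buffer
          "d"(len)         // rdx = length
        : "rcx", "r11", "memory");  // syscall overwrites rcx and r11
    return {ret, test};
}
```

The early-clobber marker on test keeps the compiler from placing it in a register that still carries one of the inputs.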

Error in simple g++ inline assembler

I'm trying to write a "hello world" program to test inline assembler in g++.
(still learning AT&T syntax)
The code is:
#include <stdlib.h>
#include <stdio.h>
# include <iostream>
using namespace std;
int main() {
int c,d;
__asm__ __volatile__ (
"mov %eax,1; \n\t"
"cpuid; \n\t"
"mov %edx, $d; \n\t"
"mov %ecx, $c; \n\t"
);
cout << c << " " << d << "\n";
return 0;
}
I'm getting the following error:
inline1.cpp: Assembler messages:
inline1.cpp:18: Error: unsupported instruction `mov'
inline1.cpp:19: Error: unsupported instruction `mov'
Can you help me to get it done?
Tks
Your assembly code is not valid. Please read up on Extended Asm carefully. Here's another good overview.
Here is CPUID example code from here:
static inline void cpuid(int code, uint32_t* a, uint32_t* d)
{
asm volatile ( "cpuid" : "=a"(*a), "=d"(*d) : "0"(code) : "ebx", "ecx" );
}
Note the format:
first : followed by output operands: : "=a"(*a), "=d"(*d); "=a" is eax and "=d" is edx
second : followed by input operands: : "0"(code); "0" means that code should occupy the same location as output operand 0 (eax in this case)
third : followed by clobbered registers list: : "ebx", "ecx"
I kept @AMA's answer as the accepted one because it was complete enough. But after giving it some thought, I concluded that it is not 100% correct.
The code I was trying to implement in GCC is the one below (Microsoft Visual Studio version).
int c,d;
_asm
{
mov eax, 1;
cpuid;
mov d, edx;
mov c, ecx;
}
When cpuid executes with eax set to 1, feature information is returned in ecx and edx.
The suggested code returns the values from eax ("=a") and edx ("=d").
This can be easily seen at gdb:
(gdb) disassemble cpuid
Dump of assembler code for function cpuid(int, uint32_t*, uint32_t*):
0x0000000000000a2a <+0>: push %rbp
0x0000000000000a2b <+1>: mov %rsp,%rbp
0x0000000000000a2e <+4>: push %rbx
0x0000000000000a2f <+5>: mov %edi,-0xc(%rbp)
0x0000000000000a32 <+8>: mov %rsi,-0x18(%rbp)
0x0000000000000a36 <+12>: mov %rdx,-0x20(%rbp)
0x0000000000000a3a <+16>: mov -0xc(%rbp),%eax
0x0000000000000a3d <+19>: cpuid
0x0000000000000a3f <+21>: mov -0x18(%rbp),%rcx
0x0000000000000a43 <+25>: mov %eax,(%rcx) <== HERE
0x0000000000000a45 <+27>: mov -0x20(%rbp),%rax
0x0000000000000a49 <+31>: mov %edx,(%rax) <== HERE
0x0000000000000a4b <+33>: nop
0x0000000000000a4c <+34>: pop %rbx
0x0000000000000a4d <+35>: pop %rbp
0x0000000000000a4e <+36>: retq
End of assembler dump.
The code that generates something closer to what I want is (EDITED based on feedback in the comments):
static inline void cpuid2(uint32_t* d, uint32_t* c)
{
int a = 1;
asm volatile ( "cpuid" : "=d"(*d), "=c"(*c), "+a"(a) :: "ebx" );
}
The result is:
(gdb) disassemble cpuid2
Dump of assembler code for function cpuid2(uint32_t*, uint32_t*):
0x00000000000009b0 <+0>: push %rbp
0x00000000000009b1 <+1>: mov %rsp,%rbp
0x00000000000009b4 <+4>: push %rbx
0x00000000000009b5 <+5>: mov %rdi,-0x20(%rbp)
0x00000000000009b9 <+9>: mov %rsi,-0x28(%rbp)
0x00000000000009bd <+13>: movl $0x1,-0xc(%rbp)
0x00000000000009c4 <+20>: mov -0xc(%rbp),%eax
0x00000000000009c7 <+23>: cpuid
0x00000000000009c9 <+25>: mov %edx,%esi
0x00000000000009cb <+27>: mov -0x20(%rbp),%rdx
0x00000000000009cf <+31>: mov %esi,(%rdx)
0x00000000000009d1 <+33>: mov -0x28(%rbp),%rdx
0x00000000000009d5 <+37>: mov %ecx,(%rdx)
0x00000000000009d7 <+39>: mov %eax,-0xc(%rbp)
0x00000000000009da <+42>: nop
0x00000000000009db <+43>: pop %rbx
0x00000000000009dc <+44>: pop %rbp
0x00000000000009dd <+45>: retq
End of assembler dump.
Just to be clear... I know that there are better ways of doing it. But the purpose here is purely educational. Just want to understand how it works ;-)
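To check the result without a debugger, the same constraint layout can be wrapped in a small helper. This is only a sketch, assuming GCC or Clang on x86-64; Leaf1Features, cpuid_leaf1 and has_sse2 are hypothetical names, and the only feature bit used is SSE2 (bit 26 of edx for leaf 1).

```cpp
#include <cstdint>

struct Leaf1Features {
    std::uint32_t ecx;  // feature flags returned in ecx
    std::uint32_t edx;  // feature flags returned in edx
};

// Runs cpuid with eax = 1 and returns the two feature registers.
Leaf1Features cpuid_leaf1() {
    std::uint32_t a = 1, c = 0, d = 0;
    asm volatile("cpuid"
                 : "=d"(d), "=c"(c), "+a"(a)  // eax is both input and output
                 :
                 : "ebx");                    // cpuid also overwrites ebx
    return {c, d};
}

// SSE2 support is reported in bit 26 of edx.
bool has_sse2() {
    return ((cpuid_leaf1().edx >> 26) & 1u) != 0;
}
```

Because the leaf number travels through the "+a" operand, no mov into eax is needed inside the template.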

Porting Inline GASM to x64 MASM Access Violation Issue

I am currently porting some code to MS Windows x64 from the https://github.com/mono project which was written for GCC Linux and I am having some challenges.
Currently I am unsure if my translation from x64 AT&T inline ASM to x64 MASM is correct. It compiles fine but my test case fails as memcpy throws exceptions/memory access violations after my ASM function executes. Is my translation correct?
One thing that really challenged me is that rip does not seem to be accessible in Windows x64 MASM. I really don't know how to translate those remaining lines of the AT&T syntax (see below), but I gave it my best try. Did I handle the lack of rip access correctly?
If my work is correct then why is memcpy failing?
Here is the related C++:
void mono_context_get_current(MonoContext cnt); //declare the ASM func
//Pass the static struct pointer to the ASM function mono_context_get_current
//The purpose here is to clobber it
#ifdef _MSC_VER
#define MONO_CONTEXT_GET_CURRENT(ctx) do { \
mono_context_get_current(ctx); \
} while (0)
#endif
static MonoContext cur_thread_ctx = {0};
MONO_CONTEXT_GET_CURRENT (cur_thread_ctx);
memcpy (&info->ctx, &cur_thread_ctx, sizeof (MonoContext)); //memcpy throws Exception.
Here is the current ASM function.
mono_context_get_current PROTO
.code
mono_context_get_current PROC
mov rax, rcx ;Assume that rcx contains the pointer being passed
mov [rax+00h], rax
mov [rax+08h], rbx
mov [rax+10h], rcx
mov [rax+18h], rdx ;purpose is to offset from my understanding of the GCC assembly
mov [rax+20h], rbp
mov [rax+28h], rsp
mov [rax+30h], rsi
mov [rax+38h], rdi
mov [rax+40h], r8
mov [rax+48h], r9
mov [rax+50h], r10
mov [rax+58h], r11
mov [rax+60h], r12
mov [rax+68h], r13
mov [rax+70h], r14
mov [rax+78h], r15
call $ + 5
mov rdx, [rax+80h]
pop rdx
mono_context_get_current ENDP
END
To my understanding the rcx register should contain the struct pointer and that I should be using rdx to pop.
As I mentioned I have GCC ASM for non-Win64 platforms which appears to work on those platforms. This is what that code looks like:
#define MONO_CONTEXT_GET_CURRENT(ctx) \
__asm__ __volatile__( \
"movq $0x0, 0x00(%0)\n" \
"movq %%rbx, 0x08(%0)\n" \
"movq %%rcx, 0x10(%0)\n" \
"movq %%rdx, 0x18(%0)\n" \
"movq %%rbp, 0x20(%0)\n" \
"movq %%rsp, 0x28(%0)\n" \
"movq %%rsi, 0x30(%0)\n" \
"movq %%rdi, 0x38(%0)\n" \
"movq %%r8, 0x40(%0)\n" \
"movq %%r9, 0x48(%0)\n" \
"movq %%r10, 0x50(%0)\n" \
"movq %%r11, 0x58(%0)\n" \
"movq %%r12, 0x60(%0)\n" \
"movq %%r13, 0x68(%0)\n" \
"movq %%r14, 0x70(%0)\n" \
"movq %%r15, 0x78(%0)\n" \
"leaq (%%rip), %%rdx\n" \
"movq %%rdx, 0x80(%0)\n" \
: \
: "a" (&(ctx)) \
: "rdx", "memory")
Thanks for any help you may be able to offer! I'll be the first to admit my assembly is pretty rusty.
You can let gcc create the asm file for you (gcc can emit Intel syntax as well, which is close to MASM but not identical):
gcc -S -masm=intel myfile.c
Comparing between the two versions there appears to be some discrepancy:
movq $0x0, 0x00(%0)
It doesn't look like rax is being saved but instead that memory slot is zero'ed out.
leaq (%%rip), %%rdx
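You should be able to translate that into Intel syntax:
lea rdx, [rip]
which is valid in 64-bit mode, where RIP-relative addressing is available. (GAS accepts this form in Intel syntax; MASM may not accept rip as an operand, in which case a label at that location can be used instead.)
And these lines are incorrectly translated from AT&T: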
call $ + 5
mov rdx, [rax+80h] ; looks reversed
pop rdx
Here's how I've translated the original gas syntax above:
mov qword ptr [rcx], 0
mov [rcx + 0x08], rbx
mov [rcx + 0x10], rax
mov [rcx + 0x18], rdx
mov [rcx + 0x20], rbp
mov [rcx + 0x28], rsp
mov [rcx + 0x30], rsi
mov [rcx + 0x38], rdi
mov [rcx + 0x40], r8
mov [rcx + 0x48], r9
mov [rcx + 0x50], r10
mov [rcx + 0x58], r11
mov [rcx + 0x60], r12
mov [rcx + 0x68], r13
mov [rcx + 0x70], r14
mov [rcx + 0x78], r15
lea rdx, [rip]
mov [rcx + 0x80], rdx
mov rdx, [rcx + 0x18] ; restore old rdx since it's on clobber list
Note that I switched rcx around with rax just to save an extra mov. So rax gets saved in place of rcx in the gas syntax. You might need to modify this depending on your invariants.
If it still crashes I'd advise stepping through it with a debugger.
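For reference, the RIP-relative lea from the GCC version can be tried in isolation. A minimal sketch, assuming GCC or Clang on x86-64 with the default AT&T dialect; current_rip is a hypothetical helper, and the value it returns is the address of the instruction that follows the lea.

```cpp
#include <cstdint>

// Returns the address of the instruction right after the lea,
// which is what a RIP-relative operand with displacement 0 refers to.
std::uintptr_t current_rip() {
    std::uintptr_t ip;
    asm volatile("leaq 0(%%rip), %0" : "=r"(ip));
    return ip;
}
```

Unlike the call/pop pattern in the question, this leaves the stack untouched.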

How to read registers: RAX, RBX, RCX, RDX, RSP, RBP, RSI, RDI in C or C++? [duplicate]

This question already has an answer here:
How can you pull a value from a register?
(1 answer)
Closed 9 years ago.
Let's say I want to read values from those registers (and that's pretty much it) on a dual-core x64 CPU. How can I do this? Can I simply write something like:
uint64_t rax = 0, rbx = 0;
__asm__ __volatile__ (
/* read value from rbx into rbx */
"movq %%rdx, %0;\n"
/* read value from rax into rax*/
"movq %%rax, %1;\n"
/* output args */
: "=r" (rbx), "=r" (rax)
: /* no input */
/* clear both rdx and rax */
: "%rdx", "%rax"
);
and then just print out rax and rbx? Cheers
The right way to do this with gcc is with register constraints:
uint64_t rax = 0, rbx = 0;
__asm__("" : "=a"(rax), "=b"(rbx) ::); /* make rax and rbx take on the current values in those registers */
Note that you don't need any actual instructions -- the constraints tell gcc that after doing nothing, the variable rax will hold the value of %rax and the variable rbx will hold the value of %rbx.
You can use the constraints a, b, c, d, S, and D (the latter two are for %rsi and %rdi). You can also use Yz for %xmm0. Unfortunately, there don't seem to be constraints for other specific registers.
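For a register without its own constraint letter, such as %rsp, one fallback is an explicit mov into a compiler-chosen output register. This is a sketch assuming GCC or Clang on x86-64 in the default AT&T dialect; read_rsp is a hypothetical name.

```cpp
#include <cstdint>

// Copies the current stack pointer into a general-purpose register
// chosen by the compiler and returns it.
std::uint64_t read_rsp() {
    std::uint64_t sp;
    asm volatile("mov %%rsp, %0" : "=r"(sp));
    return sp;
}
```

The value is only a snapshot: by the time it is printed, the compiler may already have adjusted the stack pointer for the next call.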