In my ongoing experimentation with GCC inline assembly, I've run into a new problem regarding labels and inlined code.
Consider the following simple jump:
__asm__
(
"jmp out;"
"out:;"
:
:
);
This does nothing except jump to the out label. As is, this code compiles fine. But if you place it inside a function, and then compile with optimization flags, the compiler complains: "Error: symbol 'out' is already defined".
What seems to be happening is that the compiler is repeating this assembly code every time it inlines the function. This causes the label out to get duplicated, leading to multiple out labels.
So, how do I work around this? Is it really not possible to use labels in inline assembly? This tutorial on GCC inline assembly mentions that:
Thus, you can make put your assembly
into CPP macros, and inline C
functions, so anyone can use it in as
any C function/macro. Inline functions
resemble macros very much, but are
sometimes cleaner to use. Beware that
in all those cases, code will be
duplicated, so only local labels (of
1: style) should be defined in that
asm code.
I tried to find more information about these "local labels", but can't seem to find anything relating to inline assembly. It looks like the tutorial is saying that a local label is a number followed by a colon, (like 1:), so I tried using a label like that. Interestingly, the code compiled, but at run time it simply triggered a Segmentation Fault. Hmm...
So any suggestions, hints, answers...?
A declaration of a local label is indeed a number followed by a colon. But a reference to a local label needs a suffix of f or b, depending on whether you want to look forwards or backwards - i.e. 1f refers to the next 1: label in the forwards direction.
So declaring the label as 1: is correct; but to reference it, you need to say jmp 1f (because you are jumping forwards in this case).
Well, this question isn't getting any younger, but there are two other interesting solutions.
1) This example uses %=. %= in an assembler template is replaced with a number that is "unique to each insn in the entire compilation. This is useful for making local labels that are referred to more than once in a given insn." Note that to use %=, you (apparently) must have at least one input (although you probably don't have to actually use it).
int a = 3;
asm (
"test %0\n\t"
"jnz to_here%=\n\t"
"jz to_there%=\n\t"
"to_here%=:\n\t"
"to_there%=:"
::"r" (a));
This outputs:
test %eax
jnz to_here14
jz to_there14
to_here14:
to_there14:
Alternately, you can use the asm goto (Added in v4.5 I think). This actually lets you jump to c labels instead of just asm labels:
asm goto ("jmp %l0\n"
: /* no output */
: /* no input */
: /* no clobber */
: gofurther);
printf("Didn't jump\n");
// c label:
gofurther:
printf("Jumped\n");
Related
This is the beginning of a function that already exists and works; the commented line is my addition and its purpose is to toggle a pin.
inline __attribute__((naked))
void CScheduler::SwapToThread(void* pNew, void* pPrev)
{
//*(volatile DWORD*)0x400FF08C = (1 << 14);
if (pPrev != NULL)
{
if (pPrev == this) // Special case to save scheduler stack on startup
{
asm("mov lr,%0"::"p"(&CScheduler_Run_Exit)); // load r1 with schedulers End thread
asm("orr lr, 1");
When I uncomment my addition, my hard fault handler executes. I get it has something to do with this being a naked function but I don't understand why a simple assignment causes a problem.
Two questions:
Why does this line trigger the hard fault?
How can I perform this assignment inside this function?
It was only luck that your previous version of the function happened to work without crashing.
The only thing that can safely be put inside a naked function is a pure Basic Asm statement. https://gcc.gnu.org/onlinedocs/gcc/ARM-Function-Attributes.html. You can split it up into multiple Basic Asm statements, instead of asm("insn \n\t" / "insn2 \n\t" / ...);, but you have to write the entire function in asm yourself.
While using extended asm or a mixture of basic asm and C code may appear to work, they cannot be depended upon to work reliably and are not supported.
If you want to run C++ code from a naked function, you could call a regular function (or bl on ARM, jal on MIPS, etc.), following to the standard calling convention.
As for the specific reason in this case? Maybe creating that address in a register stepped on the function args, leading to the branches going wrong? Inspect the generated asm if you want, but it's 100% unsupported.
Or maybe it ended up using more registers, and since it's naked didn't properly save/restore call-preserved registers? I haven't looked at the code-gen myself for naked functions.
Are you sure this function needs to be naked? I guess that's because you manipulate lr to return to the new context.
If you don't want to just write more logic in asm, maybe have this function's caller do more work (and maybe pass it pointer and/or boolean args telling it more simply what it needs to do, so your inputs are already in registers, and you don't need to access globals).
I have a function prototype int Palindrome(const char *c_style_string);
In ARM v8 assembly, I believe that the parameter is stored in register w0. However, isn't this also the register that ret outputs the value of?
If so, what do I need to do so that values do not get overwritten? I was thinking something like mov w0, w1 at the beginning of my code so that I refer to c_style_string as w1 whenever I parse through it, and then edit w0 to store an int...would this be right?
Thank you!
You may want to write your assembly code in compliance with the ABI for ARM 64-bit Architecture.
In the example above, you could keep the address for c_style_string in a 'Callee-saved' register (X19-X29)', and copy it to x0/w0 every time you are calling a Palindrome() - I am assuming here Palindrome() is a C function, and is therefore itself compliant with the ARCH 64-bit ABI.
A desirable side-effect would be that your C code could call always your assembly code, and vice-versa.
IMHO, your best solution is to write the C function, or minimal function, then tell the compiler to output the assembly language. This will show the calling interface for functions.
You could also look up the register passing convention in your compiler's documentation.
If you want to preserve register values, you should use the PUSH instruction (or it's equivalent, depending on ARM mode or Thumb mode). Also remember to POP the registers before the end of the function.
Following up on this question, is it possible for llvm to generate code that may jump to an arbitrary address within a function in the same address space?
i.e.
void func1() {
...
<code that jumps to addr2>
...
}
void func2() {
...
addr2:
<some code in func2()>
...
}
Yes,No,Yes,No,(yes) - It depends on the level you look at and what you mean with possible:
Yes, as the llvm backend will produce target specific assembler
instructions and those assembler instructions allow to set the
program counter to an abitrary value.
No, because - as far as I know - the llvm ir (the intermediate representation into which a frontend like clang compiles your c code) hasn't any instructions that would allow abitrary jumps between (llvm-ir) functions.
Yes, because the frontend COULD certainly produce code, that simulates that behaviour (breaking up func2 into multiple separate functions).
No, because C and C++ don't allow such jumps to ARBITRARY positions and so clang will not compile any program that tries to do that (e.g. via goto)
(yes) the c longjmp macro jumps back to a place in the control flow that you have already visited (where you called setjmp) but also restores (most) of the system state. EDIT: However, this is UB if func2 isn't somewhere up in the current callstack from where you jump.
im working on a hook in C++ and ASM and currently i have just made an easy inline hook that places a jump in the first instruction of the target function which in this case is OutputDebugString just for testing purposes.
the thing is that my hook fianlly works after about 3 days of research and figuring out the bits and peaces of how things work, but there is one problem i have no idea how to change the parameters that come in to my "dummy" function before jumping on to the rest of the original function.
as u can see in my code i have tried to change the parameter simply in C++ but of course this does not work as im poping all the registers afterwards :/
anyways here is my dummy function which is what the hooked function jumps to:
static void __declspec(naked) MyDebugString(LPCTSTR lpOutputString) {
__asm {
PUSHAD
}
//Where i suppose i could run my code, but not be able to interfere with parameters :/
lpOutputString = L"new message!";
__asm {
POPAD
MOV EDI, EDI
PUSH EBP
MOV EBP, ESP
JMP Addr
}
original_DebugString(lpOutputString);
}
i understand why the code is not working as i said, i just can't see a proper solution to this, any help is greatly appreciated.
Every compiler has a protocol for calling functions using assembly language. The protocol may be stated deep in their manuals.
A faster method to find the function protocols is to have the compiler generate an assembly language listing for your function.
The best method for writing inline assembly is to:
First write the function in C++ source code
Next print out the assembly listing of the function.
Review and understand how the compiler generated assembly works.
Lastly, modify the internal assembly to suite your needs.
My preference is to write the C++ code as efficient as I can (or to help the compiler use optimal assembly language). I then review the assembly listing. I only change the inline assembly to invoke processor special features (such as block move instructions).
I am optimizing some hotspots in my application and compilation is done using gcc-arm.
Now, is there any chance that the following statements result in different assembler code:
static const pixel_t roundedwhite = 4294572537U;
return (packed >= roundedwhite) ? purewhite : packed;
// OR
const pixel_t roundedwhite = 4294572537U;
return (packed >= roundedwhite) ? purewhite : packed;
// OR
return (packed >= 4294572537U) ? purewhite : packed;
Is there any chance that my ARM compiler might produce the unwanted code for the first case or should this get optimized anyway?
I assume that it's pretty the same, but, unfortunately, I am not that sure in what gcc-arm does compared to ordinary gcc and I can't access the disassembly listing.
Thank you very much.
Call gcc with the -S flag and take a look at the assembly:
-S
Stop after the stage of compilation proper; do not assemble. The output is in the form of an assembler code file for each non-assembler input file specified.
I would it out try myself to include in the answer, but I don't have an ARM compiler handy.
One difference is surely that the first version, with static will use up some memory, even if it the value will get inlined in the expression. This would make sense if you want to calculate a more complex expression once and then store the result, but for this simple constant the static is unnecessary. That said, the compiler will very likely inline the value, as this is a very simple optimization and there is no reason for it not to.