Selectively omit frame pointer in MSVC - c++

In GCC I can selectively set optimization flags for a specific function, so this:
void func() {}
generates:
func():
push rbp
mov rbp, rsp
nop
pop rbp
ret
And this:
__attribute__((optimize("-fomit-frame-pointer")))
void func() {}
generates:
func():
nop
ret
How can I do the same in Visual Studio?

There's a compiler command-line option, /Oy, that makes the compiler omit frame pointers. You can achieve the same for individual functions with #pragma optimize:
#pragma optimize("y", on)
int foo(int a) { // foo will be compiled with omitted frame pointers
return a;
}
#pragma optimize("y", off)
Here, foo() will be compiled with omitted frame pointers.
Note: As far as I can see, this option only takes effect in an optimized build. So either pass an optimization flag to the compiler (such as /Og), or include "g" in the pragma: #pragma optimize("gy", ...)
(I've checked this with Visual Studio 2015)
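If the same source has to build with both compilers, the two spellings can be hidden behind macros. This is only a sketch: the macro names OMIT_FP_BEGIN, OMIT_FP_END and OMIT_FP_ATTR are made up for illustration, and only the pragma and the attribute come from the question and answer above. The MSVC branch ends with #pragma optimize("", on), which per Microsoft's documentation resets the optimizations to the command-line settings rather than forcing frame pointers back on.

```cpp
#include <cassert>

// Hypothetical helper macros (not part of any compiler or library).
#if defined(_MSC_VER)
  // MSVC: enable "g" and "y" for the functions that follow,
  // then reset to whatever was given on the command line.
  #define OMIT_FP_BEGIN __pragma(optimize("gy", on))
  #define OMIT_FP_END   __pragma(optimize("", on))
  #define OMIT_FP_ATTR
#elif defined(__GNUC__) && !defined(__clang__)
  // GCC: per-function optimize attribute, spelled as in the question.
  #define OMIT_FP_BEGIN
  #define OMIT_FP_END
  #define OMIT_FP_ATTR __attribute__((optimize("-fomit-frame-pointer")))
#else
  // Other compilers: no per-function switch is assumed in this sketch.
  #define OMIT_FP_BEGIN
  #define OMIT_FP_END
  #define OMIT_FP_ATTR
#endif

OMIT_FP_BEGIN
OMIT_FP_ATTR int foo(int a) {
    return a;
}
OMIT_FP_END

// Small self-check: behaviour is unchanged, only the prologue/epilogue may differ.
void check_foo() {
    assert(foo(42) == 42);
}
```

Whether the frame pointer is actually dropped still has to be confirmed in the generated assembly for your compiler version and flags.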

Extern C++ compiled fn in asm

I'm following an OS dev series by poncho on yt.
The 6th video links C++ with assembly code using extern, but the function gets C linkage because it is declared as extern "C" void _start().
In ExtendedProgram.asm, _start was called like:
[extern _start]
Start64bit:
mov edi, 0xb8000
mov rax, 0x1f201f201f201f20
mov ecx, 500
rep stosq
call _start
jmp $
The Kernel.cpp had:
extern "C" void _start() {
return;
}
One of the comments on the video points out that for C++ a different (mangled) name, _Z6_startv, is created.
So, to try it out, I modified my Kernel.cpp as:
extern void _Z6_startv() { return; }
I also modified ExtendedProgram.asm, replacing _start with _Z6_startv, but the linker complained:
/usr/local/x86_64elfgcc/bin/x86_64-elf-ld: warning: cannot find entry symbol _start; defaulting to 0000000000008000
then I tried,
Kernel.cpp
extern "C++" void _Z6_startv() { return; } // I didn't even know wut i was doin'
And linker complained again.
I did try some other combinations & methods, all ending miserably, eventually landing here on Stack Overflow.
So, the question:
How to compile the function as a C++ function and link it to assembly?
There is some confusion about symbol names here.
A C++ function named _start is mangled to _Z6_startv at compile time, which means _Z6_startv is the symbol that the linker (and your asm code) can use. Mangling is what C++ compilers normally do, but extern "C" tells the compiler to treat the function as if it were declared in a C program, where no mangling happens, so _start stays _start. That means you do not need to change anything in the code you initially showed.
If you do want to remove the extern "C", keep the function named _start in Kernel.cpp and refer to its mangled name from the assembly:
[extern _Z6_startv]
Start64bit:
mov edi, 0xb8000
mov rax, 0x1f201f201f201f20
mov ecx, 500
rep stosq
call _Z6_startv
jmp $
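To see what a mangled name stands for without guessing, you can demangle it. Here is a minimal sketch, assuming GCC or Clang (Itanium C++ ABI), where <cxxabi.h> provides abi::__cxa_demangle; the wrapper name demangle is made up for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <cxxabi.h>  // GCC/Clang only (Itanium C++ ABI)

// Hypothetical helper: returns the demangled form of an Itanium-ABI symbol,
// or an empty string if the input cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        std::free(text);  // free(nullptr) is a no-op
        return std::string();
    }
    std::string result(text);
    std::free(text);  // the buffer is allocated with malloc
    return result;
}

// Small self-check: the symbol from the question maps back to _start().
void check_demangle() {
    assert(demangle("_Z6_startv") == "_start()");
}
```

demangle("_Z6_startv") gives "_start()", i.e. a C++ function called _start taking no arguments. A C++ function that is itself named _Z6_startv gets mangled again (to _Z10_Z6_startvv under this ABI), so renaming the function in Kernel.cpp does not produce a _Z6_startv symbol. The "cannot find entry symbol _start" warning is most likely separate: ld looks for _start as the entry point by default, and without extern "C" no symbol with that exact name exists.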

std::mutex::lock() produces weird (and unnecessary) asm code

I was checking generated asm for some of my code and my eye caught some interesting stuff:
#include <mutex>
std::mutex m;
void foo()
{
m.lock();
}
generated asm code (x86-64 gcc 9.2, -std=c++11 -O2):
foo():
mov eax, OFFSET FLAT:_ZL28__gthrw___pthread_key_createPjPFvPvE
test rax, rax
je .L10 // (1) we can simply bypass lock() call?
sub rsp, 8
mov edi, OFFSET FLAT:m
call __gthrw_pthread_mutex_lock(pthread_mutex_t*)
test eax, eax
jne .L14 // (2) waste of space that will never be executed
add rsp, 8
ret
.L10:
ret
.L14:
mov edi, eax
call std::__throw_system_error(int)
m:
.zero 40
Questions:
part (1) -- gcc specific:
what is it doing? (allocating a TLS entry?)
how does failing that operation allow us to silently bypass the lock() call?
part (2) -- looks like each compiler is affected:
std::mutex::lock() can throw according to standard
... but it never does in correct code (as discussed in related SO posts), for all intents and purposes std::mutex::lock() is always noexcept in correct code
is it possible to let compiler know so that it stops emitting unnecessary tests and instruction blocks (like .L14 above)?
Note: I can't see how throwing from std::mutex::lock() is better than simply abort()ing. In both cases your program is screwed (no one expects it to fail), but at least in latter case you end up with considerably smaller asm code ("pay only for something you use", remember?).
It seems that you are misinterpreting the asm output. What you see is not the code of foo but the inlined code of mutex::lock.
From https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/std/mutex:
void lock() // in class mutex
{
int __e = __gthread_recursive_mutex_lock(&_M_mutex);
// EINVAL, EAGAIN, EBUSY, EINVAL, EDEADLK(may)
if (__e)
__throw_system_error(__e);
}
From https://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-html-USERS-4.0/gthr-default_8h-source.html:
static inline int __gthread_recursive_mutex_lock (__gthread_recursive_mutex_t *mutex)
{
return __gthread_mutex_lock (mutex);
}
static inline int __gthread_mutex_lock (__gthread_mutex_t *mutex)
{
if (__gthread_active_p ())
return pthread_mutex_lock (mutex);
else
return 0;
}
The names do not exactly match your asm code, so I probably looked at a different libstdc++ source, but to me it looks like the compiler inlined mutex::lock into your function foo and it also inlined the functions that mutex::lock is calling.
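To make the two branches easier to map onto the asm, here is a simplified model of that inlined logic. It is only a sketch: FakeMutex, fake_pthread_mutex_lock, model_gthread_mutex_lock and model_lock are hypothetical stand-ins, not libstdc++ code, and the threads_active flag replaces the real __gthread_active_p() check (which, on glibc, appears in the asm as the test of the weak __pthread_key_create symbol's address).

```cpp
#include <cassert>
#include <system_error>

// Hypothetical stand-in for a pthread mutex; fail_with simulates an error code.
struct FakeMutex {
    bool locked = false;
    int fail_with = 0;
};

// Stand-in for pthread_mutex_lock: 0 on success, an errno-style value on failure.
inline int fake_pthread_mutex_lock(FakeMutex* m) {
    if (m->fail_with != 0) return m->fail_with;
    m->locked = true;
    return 0;
}

// Models __gthread_mutex_lock: skips locking when no threads can exist.
inline int model_gthread_mutex_lock(FakeMutex* m, bool threads_active) {
    if (threads_active) return fake_pthread_mutex_lock(m);
    return 0;  // branch (1) in the asm: the early return at .L10
}

// Models std::mutex::lock: turns a non-zero error code into an exception.
inline void model_lock(FakeMutex& m, bool threads_active) {
    int e = model_gthread_mutex_lock(&m, threads_active);
    if (e) throw std::system_error(e, std::generic_category());  // branch (2): .L14
}

// Small self-check of the single-threaded shortcut.
void check_model() {
    FakeMutex m;
    model_lock(m, false);
    assert(!m.locked);  // nothing was locked because no threads are active
}
```

So branch (1) is not a failed operation being ignored: it is the "program is not linked against the thread library, so there is nothing to lock" shortcut.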

clang ignoring attribute noinline

I expected __attribute__((noinline)), when added to a function, to make sure that that function gets emitted. This works with gcc, but clang still seems to inline it.
Here is an example, which you can also open on Godbolt:
namespace {
__attribute__((noinline))
int inner_noinline() {
return 3;
}
int inner_inline() {
return 4;
}
int outer() {
return inner_noinline() + inner_inline();
}
}
int main() {
return outer();
}
When built with -O3, gcc emits inner_noinline, but not inner_inline:
(anonymous namespace)::inner_noinline():
mov eax, 3
ret
main:
call (anonymous namespace)::inner_noinline()
add eax, 4
ret
Clang insists on inlining it:
main: # #main
mov eax, 7
ret
If I add a parameter to the functions and let them perform some trivial work, clang respects the noinline attribute: https://godbolt.org/z/NNSVab
Shouldn't noinline be independent of how complex the function is? What am I missing?
__attribute__((noinline)) prevents the compiler from inlining the function. It doesn't prevent it from doing constant folding. In this case, the compiler was able to recognize that there was no need to call inner_noinline, either as an inline insertion or an out-of-line call. It could just replace the function call with the constant 3.
It sounds like you want to use the optnone attribute instead, to prevent the compiler from applying even the most obvious of optimizations (as this one is).
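A sketch of what that could look like, reusing the functions from the question. The macro name KEEP_OUT_OF_LINE and the compute() wrapper are made up for this example; optnone is a Clang attribute, so GCC gets plain noinline here (which already kept the call in the question's output).

```cpp
#include <cassert>

// Hypothetical macro: optnone is Clang-specific; GCC only gets noinline.
#if defined(__clang__)
  #define KEEP_OUT_OF_LINE __attribute__((noinline, optnone))
#elif defined(__GNUC__)
  #define KEEP_OUT_OF_LINE __attribute__((noinline))
#else
  #define KEEP_OUT_OF_LINE
#endif

namespace {

KEEP_OUT_OF_LINE
int inner_noinline() {
    return 3;
}

int inner_inline() {
    return 4;
}

int outer() {
    return inner_noinline() + inner_inline();
}

}  // namespace

// Externally visible wrapper so the anonymous-namespace functions are used.
int compute() {
    return outer();
}

// Small self-check: the result is the same with or without the attribute.
void check_compute() {
    assert(compute() == 7);
}
```

The result is 7 either way; whether the call to inner_noinline really survives has to be checked in the generated assembly (e.g. on Godbolt) for the Clang version you use.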

MSVC 2012 generates different vtable pointer offsets for different files

Let's say I've got the following in an x64 release configuration (this is an obfuscated code snippet):
// Hdr1.h
// Dozens of includes
class Cls1
{
public:
Cls1();
virtual void bar();
// ...
protected:
// about 7 fields where some of them are of complex template type.
bool isFlag1 : 1;
bool isFlag2 : 1;
};
// Hdr2
// Dozens of includes
class Cls2
{
public:
// ...
void foo();
};
Each class is implemented in its own translation unit. When foo calls the virtual method Cls1::bar, I get a crash (access violation).
void Cls2::foo()
{
//...
Cls1 * pCls1 = // somehow I get this goddamn pointer
pCls1->bar(); // Here I crash
}
From the disassembly I see that Cls1::Cls1 stores the vtable pointer at offset 8 from the start of this. From the disassembly of Cls2::foo I see that it reads the vtable pointer from offset zero. The debugger is also unable to show this vtable correctly. If I manually read the vtable pointer at offset 8, the addresses in that table appear to be correct.
The question is: why could this happen? What pragma, or anything else, could lead to this? Compilation flags are the same for both translation units.
Below I add a bit of disassembly:
This is a normal case that I face across the code:
Module1!CSomeOkClass::CreateObjInstance:
sub rsp,28h
mov edx,4 ; own inlined operator new
lea ecx,[rdx+34h] ; own inlined operator new
call OwnMemoryRoutines!OwnMalloc (someAddr) ; own inlined operator new
xor edx,edx
test rax,rax
je Module1!CSomeOkClass::CreateObjInstance+0x40 (someAddr)
**lea rcx,[Module1!CSomeOkClass::`vftable' (someAddr)] ; Inlined CSomeOkClass::CSomeOkClass < vtable ptr**
mov qword ptr [rax+8],rdx ; Inlined CSomeOkClass::CSomeOkClass
mov qword ptr [rax+10h],rdx ; Inlined CSomeOkClass::CSomeOkClass
mov qword ptr [rax+18h],rdx ; Inlined CSomeOkClass::CSomeOkClass
mov byte ptr [rax+20h],dl ; Inlined CSomeOkClass::CSomeOkClass
mov qword ptr [rax+28h],rdx ; Inlined CSomeOkClass::CSomeOkClass
**mov qword ptr [rax],rcx ; Inlined CSomeOkClass::CSomeOkClass < offset zero**
Now let's see what I've got for Cls1::Cls1:
Module1!Cls1::Cls1:
mov qword ptr [rsp+8],rbx
push rdi
sub rsp,20h
**lea rax,[Module1!Cls1::`vftable' (someAddress)] ; vtable address**
mov rbx,rdx
mov rdi,rcx
**mov qword ptr [rcx+8],rax ; Places at offset 8**
I assure you that Cls2 expects pointer to vtable to be at offset zero.
Compilation options are:
/nologo /WX /W3 /MD /c /Zc:wchar_t /Zc:forScope /Zm192 /bigobj /d2Zi+ /Zi /Oi /GS- /GF /Oy- /fp:fast /Gm- /Ox /Gy /Ob2 /GR- /Os
I noticed that Cls1::Cls1 heavily uses SSE instructions inlined from intrinsics.
Compiler version:
Microsoft (R) C/C++ Optimizing Compiler Version 17.00.50727.1 for x64
Please note that this code works fine on other platforms/compilers.
I managed to figure out that the problem was in fact the bitfield at the very end of the Cls1 definition. If I make isFlag1 and isFlag2 ordinary bools, the generated ctor places the vtable pointer at offset zero. These flags are initialized in the ctor's initializer list. I narrowed the problem down to this bitfield by commenting out the class's code line by line. To investigate, I used WinDbg and the /P compiler option, and compiled the cpp unit manually with the original flags plus /FAs /Fa. It appears to be a compiler bug.
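For reference, here is a sketch of the workaround together with a crude runtime sanity check. Widget and first_word are hypothetical names (Widget stands in for Cls1). The check relies on an ABI detail, not on anything the C++ standard guarantees: for a simple polymorphic class like this, both the MSVC and the Itanium ABI normally put the vtable pointer in the first pointer-sized word, so two objects of the same type should agree on that word.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for Cls1, with the bitfields replaced by plain bools.
class Widget {
public:
    Widget() : isFlag1(false), isFlag2(true) {}
    virtual ~Widget() {}
    virtual void bar() {}
    bool flag1() const { return isFlag1; }
    bool flag2() const { return isFlag2; }
protected:
    bool isFlag1;  // was: bool isFlag1 : 1;
    bool isFlag2;  // was: bool isFlag2 : 1;
};

// Copies the first pointer-sized word of the object representation.
// Assumption: this word is the vtable pointer (ABI-specific, not portable).
std::uintptr_t first_word(const Widget& w) {
    std::uintptr_t word = 0;
    std::memcpy(&word, static_cast<const void*>(&w), sizeof word);
    return word;
}

// Small self-check: two objects of the same type share the same first word.
void check_layout() {
    Widget a;
    Widget b;
    assert(first_word(a) != 0);
    assert(first_word(a) == first_word(b));
}
```

A check like this would not prove the compiler bug, but it is a cheap way to notice when the ctor and the callers disagree about where the vtable pointer lives.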

Inserting a comment in __asm results in C2400 error (VS2012)

I was trying to check the compiled assembler of some code in VS 2012. I added two lines (before and after my code) as such:
__asm ; it begins here!
// My code
__asm ; it ends here!
However, VS didn't like that. I got
error C2400: inline assembler syntax error in 'opcode'; found 'bad token'
So I added a NOP, which I didn't want to:
__asm NOP ; Comment!
That worked fine. My question is twofold.
Why didn't VS allow me to add an assembly comment?
Is there a different way to add an assembly comment without adding an instruction, including NOP?
The reason it doesn't work is that __asm is a keyword, just like int is a keyword: it cannot appear by itself and must follow the proper syntax. Take the following bit of code as an example:
int main()
{
int // here's a comment, but it's ignored by the compiler
return 0;
}
This code fails with a compilation error; specifically, in VS2012 you get error C2143: syntax error : missing ';' before 'return'. The error is expected, since there is no terminating semicolon to end the statement. Add the semicolon and it compiles fine, because the code no longer disobeys the syntax of the C (or C++ in this case) language:
int main()
{
int // here's a comment, but it's ignored by the compiler
; // white space and comments are ignored by the compiler
return 0;
}
The same is true of the following code:
int main()
{
__asm ; here's a comment but it's ignored
return 0;
}
Except here we get error C2400: inline assembler syntax error in 'opcode'; found 'constant', because everything after the __asm keyword is treated as an assembler instruction and the comment is rightfully ignored. So the following code WOULD work:
int main()
{
__asm ; here's a comment but it's ignored
NOP ; white space and comments are ignored by the compiler
__asm {; here's an __asm 'block'
} // outside of __asm block so only C style comments work
return 0;
}
So that answers your first question: Why didn't VS allow me to add an assembly comment?.. because it is a syntax error.
Now for your second question: Is there a different way to add an assembly comment without adding an instruction, including NOP?
Directly, no, there is not, but indirectly, yes there is. It's worth noting that an __asm statement is compiled into machine code as part of your program, so any comment in it is discarded just as if it were a standard C/C++ comment. Trying to 'force' a comment into your assembly that way is therefore not necessary. Instead, you can use the /FAs compiler flag, which generates an assembly listing with the source (comments included) interleaved. For example:
Given the following (very simple) code:
int main()
{
// here's a normal comment
__asm { ; here's an asm comment and empty block
} // here's another normal comment
return 0;
}
When compiled with the /FAs compiler flag, the file.asm that was produced had the following output in it:
; Listing generated by Microsoft (R) Optimizing Compiler Version 18.00.31101.0
TITLE C:\test\file.cpp
.686P
.XMM
include listing.inc
.model flat
INCLUDELIB LIBCMT
INCLUDELIB OLDNAMES
PUBLIC _main
; Function compile flags: /Odtp
; File c:\test\file.cpp
_TEXT SEGMENT
_main PROC
; 2 : {
push ebp
mov ebp, esp
; 3 : // here's a normal comment
; 4 : __asm { ; here's an asm comment and empty block
; 5 : } // here's another normal comment
; 6 : return 0;
xor eax, eax
; 7 : }
pop ebp
ret 0
_main ENDP
_TEXT ENDS
END
Notice how it includes the source and comments. If this code did more, you would see more assembly and the source associated with that as well.
If you want to put comments in the inline assembly itself, you can use normal C/C++ style comments as well as assembly comments within the __asm block:
int main()
{
// here's a C comment
__asm { ; here's an asm comment
// some other comments
NOP ; asm type comment
NOP // C style comment
} // here's another comment
return 0;
}
Hope that can help.
EDIT:
It should be noted the following bit of code also compiles without error and I'm not 100% sure why:
int main()
{
__asm
__asm ; comment
// also just doing it on a single line works too: __asm __asm
return 0;
}
Compiling this code with the single __asm ; comment gives the compilation error, but with both it compiles fine; adding instructions to the above code and inspecting the .asm output shows that the second __asm is ignored for any other assembly commands preceding it. So I'm not 100% sure if this is a parsing bug or part of the __asm keyword syntax as there's no documentation on this behavior.
On Linux, g++ accepts this:
__asm(";myComment");
and outputs, when you run g++ -S -O3 filename.cpp:
# 5 "filename.cpp" 1
;myComment
However, clang++ does not like it, and complains with this, when you run clang++ -S -O3 filename.cpp:
filename.cpp:5:9: error: invalid instruction mnemonic 'myComment'
__asm(";myComment");
^
<inline asm>:1:3: note: instantiated into assembly here
;myComment
^~~~~~~~~
I was, however, able to get both g++ and clang++ to accept:
__asm("//myComment");
which outputs the same comment as in the assembly output above, for both compilers.
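Wrapped up as begin/end markers, which is what the question was after, that could look like the sketch below. ASM_MARKER is a made-up macro name; the "//" comment syntax is the one that worked for both compilers above, and it depends on the target assembler, so treat it as an assumption for other platforms.

```cpp
#include <cassert>

// Hypothetical macro: emits an assembler comment via GNU-style inline asm,
// and expands to nothing on compilers without it.
#if defined(__GNUC__)
  #define ASM_MARKER(text) __asm__ volatile("//" text)
#else
  #define ASM_MARKER(text) ((void)0)
#endif

int add_marked(int a, int b) {
    ASM_MARKER("begin add_marked");
    int sum = a + b;  // the code you want to locate in the -S output
    ASM_MARKER("end add_marked");
    return sum;
}

// Small self-check: the markers do not change the result.
void check_add_marked() {
    assert(add_marked(2, 3) == 5);
}
```

Keep in mind that the optimizer may still move ordinary computations across the markers, so at -O2 or -O3 they only give a rough idea of where the code ended up.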
What clued me in to this, since I couldn't find it anywhere else on the internet, was the following passage from Microsoft's documentation:
Microsoft Specific
Instructions in an __asm block can use assembly-language comments:
C++
__asm mov ax, offset buff ; Load address of buff
Because C macros expand into a single logical line, avoid using
assembly-language comments in macros. (See Defining __asm Blocks as C
Macros.) An __asm block can also contain C-style comments; for more
information, see Using C or C++ in __asm Blocks.
END Microsoft Specific
This page then links to here and here. These provide more information on the matter.