Redundant template instantiations left over by MSVC [duplicate] - c++

I encountered something odd with the MSVC compiler.
It emits function template definitions in the assembly output even when optimization has eliminated the need for them.
Clang and GCC remove the function definitions entirely, but MSVC does not.
Can this be fixed?
main.cpp:
#include <iostream>
template <int n> int value() noexcept
{
return n;
}
int main()
{
return value<5>() + value<10>();
}
assembly:
int value<5>(void) PROC ; value<5>, COMDAT
mov eax, 5
ret 0
int value<5>(void) ENDP ; value<5>
int value<10>(void) PROC ; value<10>, COMDAT
mov eax, 10
ret 0
int value<10>(void) ENDP ; value<10>
main PROC ; COMDAT
mov eax, 15
ret 0
main ENDP
Sample code on godbolt

The /FA switch generates the listing file for each translation unit. Since this happens before the linking stage, MSVC has not yet determined whether those two functions are required anywhere else in the program, so they are still included in the generated .asm file. (Note: this may be for simplicity on MS's part, since it can treat templates the same as regular functions in the generated .obj file, though realistically there's no actual need to store them in the .obj file, as user17732522 points out in the comments.)
During linking, MSVC determines that those functions are in fact not actually used / needed anywhere else, and thus can be eliminated (even if they were used elsewhere, since the result can be determined at compile time, they'd still be eliminated) from the compiled executable.
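The claim that the result can be determined at compile time can be checked directly. The following is a minimal sketch, assuming you are free to add constexpr to the template (the original code does not have it) and using a hypothetical helper named sum in place of main:

```cpp
// Same template as in the question, with constexpr added (an assumption
// made for this sketch) so the result can be used in a constant expression.
template <int n> constexpr int value() noexcept
{
    return n;
}

// If this compiles, the sum is a compile-time constant and no call to
// value<5>() or value<10>() is needed at run time.
static_assert(value<5>() + value<10>() == 15, "sum is folded at compile time");

// Hypothetical stand-in for the body of main in the question.
int sum()
{
    return value<5>() + value<10>();
}
```

This only shows that the value itself is a constant; whether the unused instantiations still appear in a per-translation-unit listing remains up to the compiler and its switches.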
To see what's in the final compiled executable, you can view it in a disassembler. For example, in Visual Studio, put a breakpoint in the main function, run the program, and when the breakpoint is hit, right-click and choose "View Disassembly". There you will see that the two functions no longer exist.
You can also generate a map file with the /MAP option, which likewise shows that they do not exist.
If I am reading the documentation correctly, it seems as though MS chose to include explicit instantiations of template classes and functions because it "is useful" when creating libraries. Uninstantiated templates are not put into the .obj files, though.

Just add /Zc:inline to your compile statement and it does the same thing as clang/GCC if you also wrap the template in an anonymous namespace to ensure it does not have external visibility.
#include <iostream>
namespace
{
template <int n> int value() noexcept
{
return n;
}
}
Or mark the function template inline:
template <int n> inline int value() noexcept
{
return n;
}
Both result in:
main PROC
mov eax, 15
ret 0
main ENDP
The /Zc:inline (Remove unreferenced COMDAT) switch was added in VS 2015 Update 2 as part of the C++11 Standard conformance which allows this optimization.
It is off by default in command-line builds. In MSBuild, <RemoveUnreferencedCodeData> defaults to true.
See Microsoft Docs
Otherwise, it will be cleaned up in the linker phase with /OPT:REF.

I compiled your code as given on my vs2022 in release mode. I get
return value<5>() + value<10>();
00007FF65CD21000 mov eax,0Fh
}
00007FF65CD21005 ret

Related

For a function that takes a const struct, does the compiler not optimize the function body?

I have the following piece of code:
#include <stdio.h>
typedef struct {
bool some_var;
} model_t;
const model_t model = {
true
};
void bla(const model_t *m) {
if (m->some_var) {
printf("Some var is true!\n");
}
else {
printf("Some var is false!\n");
}
}
int main() {
bla(&model);
}
I'd imagine that the compiler has all the information required to eliminate the else clause in the bla() function. The only code path that calls the function comes from main, and it takes in const model_t, so it should be able to figure out that that code path is not being used. However:
With GCC 12.2 we see that the second part is linked in.
If I inline the function this goes away though:
What am I missing here? And is there some way I can make the compiler do some smarter work? This happens in both C and C++ with -O3 and -Os.
The compiler does eliminate the else path in the copy of the function that is inlined into main. What you're looking at is the stand-alone global definition, which is not called anyway and will be discarded by the linker eventually.
If you use the -fwhole-program flag to let the compiler know that no other file is going to be linked, that unused segment is discarded:
[See online]
Additionally, you can use the static or inline keywords to achieve something similar.
The compiler cannot optimize the else path away, as the object file might be linked against any other code. This would be different if the function were static or if you used whole-program optimization.
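As a rough illustration of the static suggestion, here is a minimal sketch. The int return value and the wrapper call_bla are additions for this example (the original bla returns void and is called from main); they only make the taken branch observable.

```cpp
#include <cstdio>

struct model_t {
    bool some_var;
};

const model_t model = { true };

// 'static' gives bla internal linkage: no other translation unit can call
// it, so once the call below has been inlined the compiler is free to drop
// the stand-alone copy, including its unused "false" branch.
static int bla(const model_t *m)
{
    if (m->some_var) {
        std::puts("Some var is true!");
        return 1;
    }
    std::puts("Some var is false!");
    return 0;
}

// Hypothetical wrapper standing in for the call site in main.
int call_bla()
{
    return bla(&model);
}
```

Whether the stand-alone copy actually disappears from the output still depends on the optimization level, so it is worth checking the generated assembly rather than taking this sketch as a guarantee.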
The only code path that calls the function comes from main
GCC can't know that unless you tell it so with -fwhole-program or maybe -flto (link-time optimization). Otherwise it has to assume that some static constructor in another compilation unit could call it. (Including possibly in a shared library, but another .cpp that you link with could do it.) e.g.
// another .cpp
typedef struct { bool some_var; } model_t;
void bla(const model_t *m); // declare the things from the other .cpp
int foo() {
model_t model = {false};
bla(&model);
return 1;
}
int some_global = foo(); // C++ only: non-constant static initializer.
Example on Godbolt with these lines in the same compilation unit as main, showing that it outputs both Some var is false! and then Some var is true!, without having changed the code for main.
ISO C doesn't have easy ways to get init code executed, but GNU C (and GCC specifically) have ways to get code run at startup, not called by main. This works even for shared libraries.
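One such GNU mechanism is the constructor function attribute, supported by GCC and Clang. The sketch below assumes one of those compilers; the counter false_calls and the helper false_branch_calls are hypothetical additions that make the early call observable.

```cpp
#include <cstdio>

struct model_t {
    bool some_var;
};

// Hypothetical counter, added only so the early call can be observed.
static int false_calls = 0;

void bla(const model_t *m)
{
    if (m->some_var) {
        std::puts("Some var is true!");
    } else {
        ++false_calls;
        std::puts("Some var is false!");
    }
}

// GNU extension: this function runs during program startup, before main,
// and reaches the "false" branch that main itself never takes.
__attribute__((constructor))
static void call_bla_early()
{
    model_t other = { false };
    bla(&other);
}

// Hypothetical accessor for the counter.
int false_branch_calls()
{
    return false_calls;
}
```

Because a caller like this could live in any other object file, the compiler has to keep both branches in the externally visible definition of bla.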
With -fwhole-program, the appropriate optimization would be simply not emitting a definition for it at all, as it's already inlined into the call-site in main. Like with inline (In C++, a promise that any other caller in another compilation unit can see its own definition of the function) or static (private to this compilation unit).
Inside main, it has optimized away the branch after constant propagation. If you ran the program, no branch would actually execute; nothing calls the stand-alone definition of the function.
The stand-alone definition of the function doesn't know that the only possible value for m is &model. If you put that inside the function, then it could optimize like you're expecting.
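A minimal sketch of what putting that inside the function could look like. The name bla_fixed and the int return value are hypothetical and exist only to keep the example checkable.

```cpp
#include <cstdio>

struct model_t {
    bool some_var;
};

const model_t model = { true };

// Hypothetical variant of bla: the pointer is fixed to &model inside the
// function, so even the stand-alone definition sees a known constant and
// the compiler can fold the condition.
int bla_fixed()
{
    const model_t *m = &model;
    if (m->some_var) {
        std::puts("Some var is true!");
        return 1;
    }
    std::puts("Some var is false!");
    return 0;
}
```

The trade-off is that the function can no longer be reused with a different model, which is exactly the flexibility the compiler had to preserve in the original version.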
Only -fPIC would force the compiler to consider the possibility of symbol-interposition so the definition of const model_t model isn't the one that is in effect after (dynamic) linking. But you're compiling code for an executable not a library. (You can disable symbol-interposition for a global variable by giving it "hidden" visibility, __attribute__((visibility("hidden"))), or use -fvisibility=hidden to make that the default).

Strings missing from compiled .exe in Visual Studio

I am practicing reverse engineering software. I am using Microsoft Visual Studio. I created an empty project and then created an empty file which I called main.cpp. I then wrote and compiled the following code:
int main()
{
char* str = "hello matthew";
int x = 15;
return 0;
}
When I brought the release version of the executable over to BinText and IDA Pro, the string "hello matthew" was nowhere to be found. I also could not find the value 15, either in base 10 or in hexadecimal.
I cannot begin to understand reverse engineering if I cannot find the references to the values I am looking for in the executable.
My theory is that because my program does absolutely nothing that the compiler just omitted it all, but I do not know for sure. Does anyone know why I cannot locate that string or the value 15 in the executable when I disassemble it?
I cannot begin to understand reverse engineering ...
The first step is to actually understand how the program is built.
Before you can understand how to reverse a program, you need to understand how it's compiled and built; reversing a binary built for Windows is vastly different from reversing a binary for a *nix system.
To that end, since you're using Visual Studio, you can see this answer (option 2) explaining how to enable the assembly output of your code. Alternatively, if you're compiling via the command line, you can pass /FAs and /Fa to generate the assembly interleaved with the source.
Your code produces the following assembly:
; Listing generated by Microsoft (R) Optimizing Compiler Version 18.00.40629.0
TITLE C:\Code\test\test.cpp
.686P
.XMM
include listing.inc
.model flat
INCLUDELIB LIBCMT
INCLUDELIB OLDNAMES
CONST SEGMENT
$SG2548 DB 'hello matthew', 00H
CONST ENDS
PUBLIC _main
; Function compile flags: /Odtp
; File c:\code\test\test.cpp
_TEXT SEGMENT
_x$ = -8 ; size = 4
_str$ = -4 ; size = 4
_main PROC
; 2 : {
push ebp
mov ebp, esp
sub esp, 8
; 3 : char* str = "hello matthew";
mov DWORD PTR _str$[ebp], OFFSET $SG2548
; 4 :
; 5 : int x = 15;
mov DWORD PTR _x$[ebp], 15 ; 0000000fH
; 6 :
; 7 : return 0;
xor eax, eax
; 8 : }
mov esp, ebp
pop ebp
ret 0
_main ENDP
_TEXT ENDS
END
While this is helpful for understanding what your code is doing and how, one of the best ways to start reversing is to throw a binary into a debugger, for example by attaching Visual Studio to an executable and viewing the assembly as the program runs.
It can depend on what you're after, since a binary could be obfuscated; that is, there could be strings within the binary, but they could be encrypted or scrambled so as to be unreadable until decrypted or unscrambled by some function within the binary.
So just searching for strings won't necessarily give you anything, and trying to search for a specific binary value in the assembled code is like trying to find a needle in a stack of needles. Know why you're trying to reverse a program, then attack that vector.
Does anyone know why I cannot locate that string or the value 15 in the executable when I disassemble it?
As has been mentioned, and as you have guessed, the "release" binary you're searching through was optimized, and the compiler just removed the unused variables so the assembly was essentially returning 0.
I hope that can help.
The main reason is that your code does nothing useful with x and str, so they are entirely redundant and there is no need for them to exist in the compiled code. The compiler's optimizer therefore removes them.
If you really want to see them in the compiled code under a debugger, you need to use them, or tell the compiler not to optimize this part of the code.
This is how to tell the compiler not to optimize the variables away, using the volatile qualifier:
#include <iostream>
int main(int argc, char** argv) {
const char* volatile str = "hello matthew";
volatile int x = 15;
return 0;
}
With this, your variables show up in the compiled code in IDA Pro.
Or, as I also said, just use them:
#include <iostream>
int main(int argc, char** argv) {
const char* str = "hello matthew";
int x = 15;
std::cout << str << x;
return 0;
}

Can't get warnings to work for header-only library

I'm creating a header-only library, and I would like to get warnings for it displayed during compilation. However, it seems that only warnings for the "main" project including the library get displayed, but not for the library itself.
Is there a way I can force the compiler to check for warnings in the included library?
// main.cpp
#include "MyHeaderOnlyLib.hpp"
int main() { ... }
// Compile
g++ ./main.cpp -Wall -Wextra -pedantic ...
// Warnings get displayed for main.cpp, but not for MyHeaderOnlyLib.hpp
I'm finding MyHeaderOnlyLib.hpp via a CMake script, using find_package. I've checked the command executed by CMake, and it's using -I, not -isystem.
I've tried both including the library with <...> (when it's in the /usr/include/ directory), or locally with "...".
I suppose that you have a template library and you are complaining about the lack of warnings from its compilation. Don't look for a bad #include path; that would end up as an error. Unfortunately, without instantiation (that is, unless the templates are actually used by the .cpp), the compiler has no way to interpret the templates reliably, let alone produce sensible warnings. Consider this:
#include <vector>
template <class C>
struct T {
bool pub_x(const std::vector<int> &v, int i)
{
return v.size() < i;
}
bool pub_y(const std::vector<int> &v, int i)
{
return v.size() < i;
}
};
typedef T<int> Tint; // will not help
bool pub_z(const std::vector<int> &v, unsigned int i) // if signed, produces warning
{
return v.size() < i;
}
class WarningMachine {
WarningMachine() // note that this is private
{
//T<int>().pub_y(std::vector<int>(), 10); // to produce warning for the template
}
};
int main()
{
//Tint().pub_y(std::vector<int>(), 10); // to produce warning for the template
return 0;
}
You can try it out in codepad. Note that pub_z (with a signed i) will immediately produce a signed/unsigned comparison warning when compiled, despite never being called. It is a whole different story for the templates, though. Even if T::pub_y is called, T::pub_x still passes unnoticed without a warning. This depends on the compiler implementation: some compilers perform more aggressive checking once all the information is available, others tend to be lazy. Note that neither T::pub_x nor T::pub_y depends on the template argument.
The only way to do it reliably is to instantiate the templates and call the functions. Note that the code which does that does not need to be reachable (such as in WarningMachine), making it a candidate to be optimized away (but that depends), and also meaning that the values passed to the functions need not be valid, as the code will never run (that will save you allocating arrays or preparing whatever data the functions may need).
On the other hand, since you will have to write a lot of code to really check all the functions, you may as well pass valid data and check for result correctness and make it useful, instead of likely confusing the hell out of anyone who reads the code after you (as is likely in the above case).
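One further option, not mentioned above, is an explicit instantiation definition, which instantiates every member of a class template for the given arguments without calling anything. The sketch below reuses T and pub_x from the example and keeps the signed/unsigned comparison on purpose; with GCC and -Wall -Wextra this should surface the warning for pub_x, though how eagerly a given compiler diagnoses it is an assumption to verify.

```cpp
#include <vector>

template <class C>
struct T {
    bool pub_x(const std::vector<int> &v, int i)
    {
        // Deliberate signed/unsigned comparison, kept from the example
        // above so that there is something for the compiler to warn about.
        return v.size() < i;
    }
};

// Explicit instantiation definition: every member function of T<int> is
// instantiated here, whether or not any code ever calls it.
template struct T<int>;
```

A small .cpp file containing only such instantiations for the types you care about could be compiled as part of the library's own test build.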

Inline class functions and shared library (dll) build

I'm trying to move some code into a shared library (works fine when compiled stand-alone) but getting some issues with class inline functions. mingw/gcc v4.7.2.
Part of the problem appears to be because I prefer to define my inline functions outside the class declaration (it keeps the class declaration neater and easier to read). I always thought this was acceptable and equivalent to defining within the class declaration ... but that doesn't appear to always be the case. I've created a simple sample to demonstrate the problems. (Obviously the dllexport would normally be in a macro to switch between import/export.)
Header:
// Uncomment one at a time to see how it compiles with: -O2 -Winline
//#define INLINE_OPTION 1 // implicit - builds without inline warnings
#define INLINE_OPTION 2 // simple external inline - gives inline warnings
//#define INLINE_OPTION 3 // external forced inline - gives inline errors
class __attribute__((dllexport)) Dummy {
public:
Dummy() : m_int{0} {}
~Dummy() {}
#if INLINE_OPTION == 1
int get_int() const { return m_int; }
#else
int get_int() const;
#endif
int do_something();
private:
int m_int;
};
#if INLINE_OPTION == 2
inline int Dummy::get_int() const
{ return m_int; }
#endif
#if INLINE_OPTION == 3
inline __attribute__((always_inline)) int Dummy::get_int() const
{ return m_int; }
#endif
.cpp file:
int Dummy::do_something()
{
int i = get_int();
i *= 2;
return i;
}
As noted above, with INLINE_OPTION == 1 (implicit, in-class inline definition) the code compiles without warning.
With INLINE_OPTION == 2 (out-of-class inline definition) I get this warning: 'int Dummy::get_int() const' can never be inlined because it uses attributes conflicting with inlining [-Winline]
With INLINE_OPTION == 3 (trying to force inline), I get the same warning as above, AND I get this error: error: inlining failed in call to always_inline 'int Dummy::get_int() const': function not inlinable, with the information about it being called from the first line inside Dummy::do_something() in the .cpp file. Notice this is about trying to inline the function within the library itself! For simple accessor functions this could be a very significant overhead.
Am I doing something wrong? Is gcc right in treating the out-of-class inline function definition differently from in-class function definitions? (Am I really forced to clutter the class declaration?)
Note: The problem doesn't just affect things that I declare inline. It also affects anything declared as constexpr and even destructors declared as "= default" when inheritance is involved.
Edit:
Just tried with mingw64 / gcc v4.8.0 with the same results. Note that this includes the fact that option 1 does NOT inline in do_something (I checked the assembler output), so apparently the only difference between option 1 and option 2 is that only option 2 gives the -Winline warning.
I don't know anything about how to make shared libraries on Windows. On Linux/OSX no special treatment is required in the source code, so both shared (.so) and ordinary (.a) libraries can be made from the same sources.
If you really do need a special attribute for symbols to be exported into shared libraries, then you may simply split the code, e.g.
namespace implementation_details {
class __attribute__((dllexport)) DummyBase
{
protected:
DummyBase() : m_int{0} {}
~DummyBase() {}
int do_something();
int m_int;
};
}
struct Dummy: private implementation_details::DummyBase
{
using implementation_details::DummyBase::do_something;
int get_int() const noexcept;
};
inline __attribute__((always_inline)) int Dummy::get_int() const noexcept
{ return m_int; }
Ok maybe my answer was a little cryptic... let me give you a quick example of what I mean using your code snippets.
dummy.h:
#ifndef _DUMMY_H_
#define _DUMMY_H_
class __attribute__((dllexport)) Dummy {
public:
Dummy() : m_int{0} {}
~Dummy() {}
int get_int() const;
int do_something();
private:
int m_int;
};
// here goes the include of the implementation header file
#include "dummy.h.impl"
#endif // _DUMMY_H_
dummy.h.impl:
// there will be no symbol for Dummy::get_int() in the dll.
// Only its contents are copied to the places where it
// is used. Placing this in the header gives other binaries
// you build with this lib the chance to do the same.
inline int Dummy::get_int() const
{ return m_int; }
Of course you could place the inline definitions just below your class declaration in the same header file. However, I find this still violates the separation of declaration and definition.
dummy.cpp:
// this method will become a symbol in the library because
// it is a C++ source file.
int Dummy::do_something()
{
// i would if i knew what to do...
return 0;
}
Hope I could be of help.
The edit I did on another post didn't seem to take, and anyway it seems some additional clarity may be appropriate, so I am posting details I've sent to another forum. In the code below, class C is the workaround for this problem: export only the non-inline members, not the whole class. As noted in the comments elsewhere, __declspec(dllexport) and __attribute__((dllexport)) are equivalent.
test.hpp
class __declspec(dllexport) A {
public:
int fa() { return m; }
int ga();
private:
int m{0};
};
class __declspec(dllexport) B {
public:
int fb();
int gb();
private:
int m{0};
};
inline int B::fb() { return m; }
class C {
public:
int fc() { return m; }
__declspec(dllexport) int gc();
private:
int m{0};
};
test.cpp
#include "test.hpp"
int A::ga() { return (fa() + 1); }
int B::gb() { return (fb() + 1); }
int C::gc() { return (fc() + 1); }
If you compile this with the options -std=c++11 -O2 -S -Winline (using mingw/mingw64 with gcc v4.7.2 or v4.8.0), you can see that the assembler produced for the library functions ga, gb and gc looks like this:
ga:
subq $40, %rsp
.seh_stackalloc 40
.seh_endprologue
call _ZN1A2faEv
addl $1, %eax
addq $40, %rsp
ret
gb:
subq $40, %rsp
.seh_stackalloc 40
.seh_endprologue
call _ZN1B2fbEv
addl $1, %eax
addq $40, %rsp
ret
gc:
.seh_endprologue
movl (%rcx), %eax
addl $1, %eax
ret
and you get the warnings:
warning: function 'int B::fb()' can never be inlined because it uses attributes conflicting with inlining [-Winline]
warning: inlining failed in call to 'int B::fb()': function not inlinable [-Winline] (called from B::gb())
Notice that there were no warnings about fa not inlining (which is, I think, expected). But also notice that ga, gb and gc are all library functions. Whatever you may think about whether the inline functions themselves should be exported, there is no good reason why the inlines cannot be inlined inside the library. Hence I consider this a bug in the compiler.
Take a look around at well regarded code and see how much you find that exports only explicit members. For example those few parts of boost that get compiled into a library (eg: regex), use the class A technique, which means the many accessor functions are not being inlined inside the library.
But, all that aside, the answer for now is the class C technique (obviously in real code this has to be enclosed in a macro to switch between export and import as you would normally at the class level).
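The export/import macro mentioned above might look roughly like the following sketch. The names MYLIB_API and MYLIB_BUILDING are hypothetical, and MYLIB_BUILDING is defined inline here only so that the fragment compiles on its own; normally the build system would define it when compiling the library's own sources.

```cpp
// Hypothetical macro names, chosen for illustration only.
// Normally set by the build system while building the library itself.
#define MYLIB_BUILDING

#if defined(_WIN32)
#  if defined(MYLIB_BUILDING)
#    define MYLIB_API __declspec(dllexport)
#  else
#    define MYLIB_API __declspec(dllimport)
#  endif
#else
#  define MYLIB_API  // no decoration needed on other platforms
#endif

class C {
public:
    int fc() { return m; }   // inline accessor, deliberately not exported
    MYLIB_API int gc();      // only the out-of-line member is exported
private:
    int m{0};
};

// Library-side definition; fc() can be inlined here because the class as a
// whole carries no dllexport attribute.
int C::gc() { return fc() + 1; }
```

Clients of the library would include the same header without MYLIB_BUILDING defined, so gc is seen as imported while fc is compiled inline into their own code.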
This is not a compiler bug, as some suggested. In C++, if a function is inline, it has to be declared inline in every declaration. There are 5 properties that have to be met and one of them is:
An inline function with external linkage (e.g. not declared static) has the following additional properties:
1) It must be declared inline in every translation unit.
...
In your example, you first declared the function Dummy::get_int() as non-inline inside the class definition. That means the function cannot be redeclared as inline.
Source: http://en.cppreference.com/w/cpp/language/inline
BTW: inline specifier works differently in C, where you can declare both inline and non-inline versions of the same function. Still, you have to implement both and ensure that they do the same thing.
Why don't you declare your function inline in the class declaration (inline int get_int() const;)? Maybe the error is there?
The compiler cannot inline a function which has to be exported from a DLL. After all, when it is called from an executable linked with your DLL, the function has to have an address. Most probably the call from do_something will be inlined, but in the general case I think it's just impossible.

Running a function on a separately allocated stack

I am trying to run a function on a separately allocated stack.
I want to keep the stack for later so I can restore it and resume the function.
The following code compiles and runs, but nothing prints to the screen.
#include <cstdlib>
#include <csetjmp>
#include <iostream>
using namespace std;
unsigned char stack[65535];
unsigned char *base_ptr = stack + 65535 - 1;
unsigned char *old_stack;
unsigned char *old_base;
void function()
{
cout << "hello world" << endl;
}
int main()
{
__asm
{
mov old_base, ebp
mov old_stack, esp
mov ebp, base_ptr
mov esp, base_ptr
call function
mov ebp, old_base
mov esp, old_stack
}
}
using vs2012/win8/intel Q9650
Welcome to C++ and name mangling. Function names in C++ are mangled by the compiler (such that using gcc function becomes _Z8functionv for me). This is to facilitate function overloading. The compiler keeps track of the actual names that it has given the different functions in the background so you aren't aware of it. This is a problem for any other language that tries to interact with C++.
This code won't link on my computer.
The solutions:
1) compile with g++ and pass the -S flag (so g++ -S test.cpp). And then take a look at the assembly output (cat test.s) to see what the function is called. Then change the name in "call function" to be "call _Z8functionv" (for me - it could easily be different for you).
2) use C: change the cout << to a printf statement and the above should work.
I take it that you aren't using gcc though (as the assembler is back to front for gas - I had to switch all the operands on the assembler around).
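An alternative to looking up the mangled name, offered here as a sketch rather than as what the original poster needs, is to give the function C linkage with extern "C". The int return value is an addition for illustration so that the call can be checked; the original function returns void.

```cpp
#include <cstdio>

// extern "C" turns off C++ name mangling for this function, so the symbol
// keeps the plain name "function" (some 32-bit targets still prepend an
// underscore).
extern "C" int function()
{
    // Returning printf's character count is an addition for this sketch;
    // "hello world\n" is 12 characters long.
    return std::printf("hello world\n");
}
```

The cost is that an extern "C" function cannot be overloaded, which is the feature name mangling exists to support.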
Actually I don't see any problem with your code.
Your sample taken as-is compiles, links and runs as expected.
Perhaps your problem is with console settings, or some global STL/CRT initialization, or something else. Anyway, you may put a breakpoint inside your function to make sure you're getting there.
According to Intel's x86 documentation for MOV, page 3-403, you should load the SS register immediately before loading a new ESP value. That blocks any interrupts from running until ESP has been assigned.