Linker removes unreferenced code


I'm working on a modular project and some of our systems don't work. I'll try to explain what we're trying to do.
We have a main project that is extended by a number of modules (DLLs). These modules can contain bootstrapper code that registers itself with our bootstrapping system before the main loop starts. The problem is that a module does not register itself unless we explicitly call a function from that module. We think this is because the linker removes unreferenced code as part of its optimizations (it also happens in debug builds).
The main function is set up as follows:
#include <bootstrap/bootstrapper.hpp>
#include <module/module.hpp>
int main()
{
// foo is an empty function in the module header file
foo(); // if I were to remove this empty function,
// the modules bootstrapping code will not execute
return fade::bootstrap::run();
}
Without the call to foo(), the bootstrapping code doesn't get executed. The module source looks like this:
void foo()
{
}
namespace
{
std::unique_ptr<game> game_;
FADE_BOOTSTRAP_MODULE(module_game) // registers itself to the bootstrapper
}
We've tried a number of things such as the linker options:
/INCLUDE
/OPT:NOREF
/EXPORT
But to no avail: they either give us undefined symbol errors or do nothing at all.
Is there anything we can do so that the unreferenced code doesn't get optimized away? We want to keep our project modular and cross-platform, so we'd rather not hardcode any solutions into our main function.

Related

Code coverage missing a branch if a function takes reference parameter

I have the following function, which takes a reference parameter:
#include <iostream>
class A { public: static void TestA(const int &y); };
void A::TestA(const int &y) { std::cout << y; }
int main()
{
A::TestA(2);
return 0;
}
In my (lcov) code coverage report for Google Test unit tests, a branch is reported as missed in TestA(), and the symbol list has a stack_chk_fail symbol added. If I change the function parameter to a non-reference, coverage is 100%.
I am using the g++ compiler.
Am I missing anything?
Thanks

The compiler inlines TestA into main (because that's what a good compiler does). However, it also has to create code for TestA because it has external linkage. Effectively, the code for the function exists twice: once inlined into main, and once in the code for TestA that the linker can link with other compilation units.
If your compiler is bad with code attribution (in the debug symbols) for inlined functions (hello MSVC?) then your coverage tool will give you exactly the result you see: executing the program does not lead to coverage for TestA because no piece of the binary that is executed (i.e. main) has any line attribution into TestA.
Changing the parameter type may affect inlining, but it's more likely that it changes how debug symbols are generated.
To verify this, step through the program with a debugger, with a breakpoint in TestA. If that breakpoint is not hit when running main, your coverage tool won't see coverage for that line either. Or, if you really want, look into the debug symbols manually to see which lines have attribution. In Visual Studio you can also look at the disassembly while debugging (it will show the associated code lines).
For the above reasons you will generally get more reliable coverage results if you do coverage runs with debug builds (where e.g. inlining won't happen).

How does it work and compile a C++ extension of TCL with a Macro and no main function

I have a working set of Tcl scripts plus a C++ extension, but I don't know exactly how it works or how it was compiled. I am using gcc on Arch Linux.
It works as follows: when we execute the test.tcl script, it passes some values to an object of a class defined in the C++ extension. Using these values, the extension (by way of a macro) computes some results and draws some graphics.
In the test.tcl script I have:
#!object
use_namespace myClass
proc simulate {} {
uplevel #0 {
set running 1
for {} {$running} { } {
moveBugs
draw .world.canvas
.statusbar configure -text "t:[tstep]"
}
}
}
set toroidal 1
set nx 100
set ny 100
set mv_dist 4
setup $nx $ny $mv_dist $toroidal
addBugs 100
# size of a grid cell in pixels
set scale 5
myClass.scale 5
The object.cc looks like:
#include //some includes here
MyClass myClass;
make_model(myClass); // --> this is a macro!
The Macro "make_model(myClass)" expands as follows:
namespace myClass_ns { DEFINE_MYLIB_LIBRARY; int TCL_obj_myClass
(mylib::TCL_obj_init(myClass),TCL_obj(mylib::null_TCL_obj,
(std::string)"myClass",myClass),1); };
The Class definition is:
class MyClass:
{
public:
int tstep; //timestep - updated each time moveBugs is called
int scale; //no. pixels used to represent bugs
void setup(TCL_args args) {
int nx=args, ny=args, moveDistance=args;
bool toroidal=args;
Space::setup(nx,ny,moveDistance,toroidal);
}
The whole thing creates a cell-grid with some dots (bugs) moving from one cell to another.
My questions are:
How do the class methods and variables get the values from the script?
How is it possible to have C++ code and compile it without a main function?
What is that macro doing in the extension, and how does it work?
Thanks

Whenever a command in Tcl is run, it calls a function that implements that command. That function is written in a language like C or C++, and it is passed in the arguments (either as strings or Tcl_Obj* values). A full extension will also include a function to do the library initialisation; the function (which is external, has C linkage, and which has a name like Foo_Init if your library is foo.dll) does basic setting up tasks like registering the implementation functions as commands, and it's explicit because it takes a reference to the interpreter context that is being initialised.
The implementation functions can do pretty much anything they want, but to return a result they use one of the functions Tcl_SetResult, Tcl_SetObjResult, etc. and they have to return an int containing the relevant exception code. The usual useful ones are TCL_OK (for no exception) and TCL_ERROR (for stuff's gone wrong). This is a C API, so C++ exceptions aren't allowed.
It's possible to use C++ instance methods as command implementations, provided there's a binding function in between. In particular, the function has to get the instance pointer by casting a ClientData value (an alias for void* in reality, remember this is mostly a C API) and then invoking the method on that. It's a small amount of code.
Compiling things is just building a DLL that links against the right library (or libraries, as required). While extensions are usually recommended to link against the stub library, it's not necessary when you're just developing and testing on one machine. But if you're linking against the Tcl DLL, you'd better make sure that the code gets loaded into a tclsh that uses that DLL. Stub libraries get rid of that tight binding, providing pretty strong ABI stability, but are a little more work to set up: you need to define the right C macro to turn them on, and you need to make an extra API call in your initialisation function.
I assume you already know how to compile and link C++ code. I won't tell you how to do it, but there's bound to be other questions here on Stack Overflow if you need assistance.
Using the code? For an extension, it's basically just:
# Dynamically load the DLL and call the init function
load /path/to/your.dll
# Commands are all present, so use them
NewCommand 3
There are some extra steps later on to turn a DLL into a proper Tcl package, abstracting code that uses the DLL away from the fact that it is exactly that DLL and so on, but they're not something to worry about until you've got things working a lot more.

how to start the execution of a program in c/c++ from a different function,but not main() [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Does the program execution always start from main in C?
I want to start the execution of my program, which contains two functions (excluding main):
void check(void);
void execute(void);
I want execution to start from check(). Is that possible in C/C++?

You can do this with a simple wrapper:
int main()
{
check();
}
You can't portably do it in any other way since the standard explicitly specifies main as the program entry point.
EDIT for comment: Don't ever do this. In C++ you could abuse static initialization to have check called before main during static init, but you still can't call main legally from check. You can just have check run first. As noted in a comment this doesn't work in C because it requires constant initializers.
// At file scope.
bool abuse_the_language = (check(), true);
int main()
{
// No op if desired.
}

Various linkers have options to specify the entry point. E.g., the Microsoft linker uses /ENTRY:function:
The /ENTRY option specifies an entry point function as the starting
address for an .exe file or DLL.
GNU ld uses the -e option, or the ENTRY() command in a linker script.
Needless to say, modifying the entry point is a very advanced feature, and you must understand exactly how it works before using it. For one, it may cause the standard library's initialization to be skipped.

int main()
{
check();
return 0;
}

Calling check from main seems like the most logical solution, but you could still explore using /ENTRY to define another entry point for your application. See here for more info.

You cannot start in something other than main, although there are ways to have some code execute before main.
Putting code in a static initialization block will have the code run prior to main; however, it won't be 100% controllable. While you can be assured that it runs prior to main, you cannot specify the order in which two static initialization blocks run relative to each other; you only know that both finish before main.
Linkers and loaders both have the concept of main held as a shared "understood" start of a C / C++ program; however, there is code that runs prior to main. This code is responsible for "setting up the environment" of the program (things like setting up stdin or cin). By putting code in a static initialization block, you effectively say, "hey you need to do this too to have the right environment". Generally, this should be something small, that can stand independently in execution order of other items.
If you need two or three things to execute in order before main, then make them into proper functions and call them at the beginning of main.

There is a contrived way to achieve that, but it is nothing more than a hack.
The idea is to create a static library containing the main function, and make it call your "check" function.
The linker will resolve the symbol when linking against your "program", and your "program" code will indeed not have a main by itself.
This is NOT recommended, unless you have very specific needs (an example that pops to mind is Windows Screensavers, as the helper library that comes with the Windows SDK has a main function that performs specific initialization like parsing the command line).

It may be supported by the toolchain. For example, with gcc you can use -nostartfiles and --entry=xxx to set the entry point of the program. The default entry point is _start, which calls the function main.

You can "intercept" the call to main by creating an object before the main starts. The constructor needs to execute your function.
#include <iostream>
void foo()
{
// do stuff
std::cout<<"exiting from foo" <<std::endl;
}
struct A
{
A(){ foo(); };
};
static A a;
int main()
{
// something
std::cout<<"starting main()" <<std::endl;
}

I have found a solution to my own question.
We can simply use
#pragma startup function-name <priority>
#pragma exit function-name <priority>
These two pragmas allow the program to specify function(s) that should be called either upon program startup (before the main function is called), or program exit (just before the program terminates through _exit).
The specified function-name must be a previously declared function taking no arguments and returning void; in other words, it should be declared as:
void func(void);
The optional priority parameter should be an integer in the range 64 to 255. The highest priority is 0. Functions with higher priorities are called first at startup and last at exit. If you don't specify a priority, it defaults to 100.
thanks!

Get a function declaration from another llvm::Module

In my application I have two LLVM modules: the runtime one (which contains the void foo(int * a) function definition) and the executable one (which I'm creating using the LLVM C++ API).
In my executable module I create int main(int argc, char ** argv) and want to put an llvm::CallInst into its body, which would call the foo() function from the runtime module.
Here is my code:
Function * fooF = Function::Create(runtimeModule->getFunction("foo")->getFunctionType(),
GlobalValue::WeakAnyLinkage, "foo", execModule);
After that, I link the two modules together:
Linker linker("blabla", execModule, false);
linker.LinkInFile("/path/to/runtime.bc", false);
execModule = linker.releaseModule();
This compiles OK, but when I run the Verifier pass on the linked module I get:
Global is external, but doesn't have external or dllimport or weak linkage!
void (%i32*)* @foo
invalid linkage type for function declaration
void (%i32*)* @foo
It's worth mentioning that all globals in the runtime module are internalized using the Internalize pass. After linking, but before running the Verifier, I run the Dead Global Elimination pass among some other optimizations. When I dump() the resulting module, I see that the @foo coming from the runtime module gets removed too, even though it's used by main(). It seems LLVM thinks that the @foo definition in the runtime module and the @foo declaration in the executable module are unrelated.
I've tried playing with linkage types, with no luck.
So, what is the right way to create a call to a function from another module?

OK, I've fixed it, but I still can't understand what the problem was. While building my runtime bitcode module, I had been applying the internalize transformation to it. So I tried doing this at run time, after linking, and that helped.
Ah, and I've been using GlobalValue::WeakAnyLinkage.

Do you really need a main() in C++?

From what I can tell you can kick off all the action in a constructor when you create a global object. So do you really need a main() function in C++ or is it just legacy?
I can understand that it could be considered bad practice to do so. I'm just asking out of curiosity.

If you want to run your program on a hosted C++ implementation, you need a main function. That's just how things are defined. You can leave it empty if you want of course. On the technical side of things, the linker wants to resolve the main symbol that's used in the runtime library (which has no clue of your special intentions to omit it - it just still emits a call to it). If the Standard specified that main is optional, then of course implementations could come up with solutions, but that would need to happen in a parallel universe.
If you go with "Execution starts in the constructor of my global object", beware that you set yourself up for many problems related to the order of construction of namespace-scope objects defined in different translation units. (So what is the entry point? The answer is: you will have multiple entry points, and which entry point is executed first is unspecified!) In C++03 you aren't even guaranteed that cout is properly constructed (in C++0x you have a guarantee that it is, before any code tries to use it, as long as there is a preceding include of <iostream>).
You don't have those problems and don't need to work around them (which can be very tricky) if you properly start executing things in ::main.
As mentioned in the comments, there are however several systems that hide main from the user by having them name a class which is instantiated within main. This works similarly to the following example
class MyApp {
public:
MyApp(std::vector<std::string> const& argv);
int run() {
/* code comes here */
return 0;
};
};
IMPLEMENT_APP(MyApp);
To the user of this system, it's completely hidden that there is a main function, but that macro would actually define such a main function as follows
#define IMPLEMENT_APP(AppClass) \
int main(int argc, char **argv) { \
AppClass m(std::vector<std::string>(argv, argv + argc)); \
return m.run(); \
}
This doesn't have the problem of unspecified order of construction mentioned above. The benefit of them is that they work with different forms of higher level entry points. For example, Windows GUI programs start up in a WinMain function - IMPLEMENT_APP could then define such a function instead on that platform.

Yes! You can do away with main.
Disclaimer: You asked if it were possible, not if it should be done. This is a totally un-supported, bad idea. I've done this myself, for reasons that I won't get into, but I am not recommending it. My purpose wasn't getting rid of main, but it can do that as well.
The basic steps are as follows:
Find crt0.c in your compiler's CRT source directory.
Add crt0.c to your project (a copy, not the original).
Find and remove the call to main from crt0.c.
Getting it to compile and link can be difficult; how difficult depends on which compiler and which compiler version.
Added
I just did it with Visual Studio 2008, so here are the exact steps you have to take to get it to work with that compiler.
Create a new C++ Win32 Console Application (click next and check Empty Project).
Add new item.. C++ File, but name it crt0.c (not .cpp).
Copy contents of C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\crt\src\crt0.c and paste into crt0.c.
Find mainret = _tmain(__argc, _targv, _tenviron); and comment it out.
Right-click on crt0.c and select Properties.
Set C/C++ -> General -> Additional Include Directories = "C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\crt\src".
Set C/C++ -> Preprocessor -> Preprocessor Definitions = _CRTBLD.
Click OK.
Right-click on the project name and select Properties.
Set C/C++ -> Code Generation -> Runtime Library = Multi-threaded Debug (/MTd) (*).
Click OK.
Add new item.. C++ File, name it whatever (app.cpp for this example).
Paste the code below into app.cpp and run it.
(*) You can't use the runtime DLL, you have to statically link to the runtime library.
#include <iostream>
class App
{
public: App()
{
std::cout << "Hello, World! I have no main!" << std::endl;
}
};
static App theApp;
Added
I removed the superfluous exit call and the blurb about lifetime as I think we're all capable of understanding the consequences of removing main.
Ultra Necro
I just came across this answer and read both it and John Dibling's objections below. It was apparent that I didn't explain what the above procedure does and why that does indeed remove main from the program entirely.
John asserts that "there is always a main" in the CRT. Those words are not strictly correct, but the spirit of the statement is. Main is not a function provided by the CRT, you must add it yourself. The call to that function is in the CRT provided entry point function.
The entry point of every C/C++ program is a function in a module named 'crt0'. I'm not sure if this is a convention or part of the language specification, but every C/C++ compiler I've come across (which is a lot) uses it. This function basically does three things:
Initialize the CRT
Call main
Tear down
In the example above, the call is _tmain but that is some macro magic to allow for the various forms that 'main' can have, some of which are VS specific in this case.
What the above procedure does is it removes the module 'crt0' from the CRT and replaces it with a new one. This is why you can't use the Runtime DLL, there is already a function in that DLL with the same entry point name as the one we are adding (2). When you statically link, the CRT is a collection of .lib files, and the linker allows you to override .lib modules entirely. In this case a module with only one function.
Our new program contains the stock CRT, minus its CRT0 module, but with a CRT0 module of our own creation. In there we remove the call to main. So there is no main anywhere!
(2) You might think you could use the runtime DLL by renaming the entry point function in your crt0.c file, and changing the entry point in the linker settings. However, the compiler is unaware of the entry point change and the DLL contains an external reference to a 'main' function which you're not providing, so it would not compile.

Generally speaking, an application needs an entry point, and main is that entry point. The fact that initialization of globals might happen before main is pretty much irrelevant. If you're writing a console or GUI app you have to have a main for it to link, and it's only good practice to have that routine be responsible for the main execution of the app rather than use other features for bizarre unintended purposes.

Well, from the perspective of the C++ standard, yes, it's still required. But I suspect your question is of a different nature than that.
I think doing it the way you're thinking about would cause too many problems though.
For example, in many environments the return value from main is given as the status result from running the program as a whole. And that would be really hard to replicate from a constructor. Some bit of code could still call exit of course, but that seems like using a goto and would skip destruction of anything on the stack. You could try to fix things up by having a special exception you threw instead in order to generate an exit code other than 0.
But then you still run into the problem of the order of execution of global constructors not being defined. That means that in any particular constructor for a global object you won't be able to make any assumptions about whether or not any other global object yet exists.
You could try to solve the constructor order problem by just saying each constructor gets its own thread, and if you want to access any other global objects you have to wait on a condition variable until they say they're constructed. That's just asking for deadlocks though, and those deadlocks would be really hard to debug. You'd also have the issue of which thread exiting with the special 'return value from the program' exception would constitute the real return value of the program as a whole.
I think those two issues are killers if you want to get rid of main.
And I can't think of a language that doesn't have some basic equivalent to main. In Java, for example, there is an externally supplied class name whose static main function is called. In Python, there's the __main__ module. In Perl there's the script you specify on the command line.

If you have more than one global object being constructed, there is no guarantee as to which constructor will run first.

If you are building static or dynamic library code then you don't need to define main yourself, but you will still wind up running in some program that has it.

If you are coding for Windows, do not do this.
Running your app entirely from within the constructor of a global object may work just fine for quite a while, but sooner or later you will make a call to the wrong function and end up with a program that terminates without warning.
Global object constructors run during the startup of the C runtime.
The C runtime startup code runs during the DLLMain of the C runtime DLL.
During DLLMain, you are holding the DLL loader lock.
Trying to load another DLL while already holding the DLL loader lock results in a swift death for your process.
Compiling your entire app into a single executable won't save you - many Win32 calls have the potential to quietly load system DLLs.

There are implementations where global objects are not possible, or where non-trivial constructors are not possible for such objects (especially in the mobile and embedded realms).
