What is the difference between linking and binding? - c++

I was reading about the two and got confused. What are the differences between them?

Binding is a word that is used in more than one context. It always has to do with connecting one thing to another; however, when the act of binding happens can vary.
There is a concept of binding time, the point at which some component is bound to some other component. A basic list of binding times is: (1) binding at compile time, (2) binding at link time, (3) binding at load time, and (4) binding at run time.
Binding at compile time happens when the source code is compiled. For C/C++ there are two main stages: the preprocessor, which performs source text replacement such as #define or macro replacement, and the compilation proper, which converts the source text into machine code along with the information the linker needs.
Binding at link time is when the external symbols are linked to a specific set of object files and libraries. You may have several different static libraries that have the same set of function names but the actual implementation of the function is different. So you can choose which library implementation to use by selecting different static libraries.
Binding at load time is when the loader loads the executable into memory along with any dynamic or shared libraries. The loader binds function calls to a particular dynamic or shared library and the library chosen can vary.
Binding at run time is when the program is actually running and makes choices depending on the current path of execution, for example when a call goes through a virtual function or a function pointer.
So linking is actually just one of the types of binding. Take a look at the Stack Overflow question "Static linking vs dynamic linking", which provides more information about linking and libraries.
You may also be interested in std::bind in C++, so here is another Stack Overflow question: "std::function and std::bind: what are they, and when should they be used?".
The longer you wait before binding something to something else, the more flexibility you have in how the software can be used. However, there is often a trade-off between delaying binding and run-time efficiency, as well as complexity of the source.
For an example of binding time, consider an application that opens a file, reads from it, and then closes it. You can pick a couple of different times at which the file name is bound to the file open.
You might hard code the file name, binding at compile time, which means that the program can only be used with that one file. To change the file name you have to change the source and recompile.
You might have the file name entered by the user, such as with a user prompt or a command line argument, binding the file name to the file open at run time. To change the file name you no longer need to recompile; you can just run the program again with a different file name.
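A minimal sketch of those two choices, assuming a made-up default name "data.txt" and hypothetical helpers pick_file_name and count_lines (none of these come from the answer itself):

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Compile-time binding: the name is fixed in the source.
// Changing it means editing this line and recompiling.
const char* const kDefaultFileName = "data.txt";  // hypothetical file name

// Run-time binding: use the first command-line argument if there is one,
// otherwise fall back to the compiled-in default.
std::string pick_file_name(int argc, const char* const* argv) {
    assert(argv != nullptr);
    if (argc > 1 && argv[1] != nullptr) {
        return argv[1];
    }
    return kDefaultFileName;
}

// Opens the chosen file and counts its lines; returns -1 if it cannot be opened.
long count_lines(const std::string& file_name) {
    std::ifstream in(file_name);
    if (!in) {
        return -1;
    }
    long lines = 0;
    std::string line;
    while (std::getline(in, line)) {
        ++lines;
    }
    return lines;
}
```

A main(int argc, char** argv) would simply call count_lines(pick_file_name(argc, argv)). Running the same executable with a different argument then reads a different file without a rebuild.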

Suppose you have a function declared as:
void f(int, char);
and also as:
void f(int);
and you call f(4). The compiler picks the declaration whose signature matches the call, here void f(int). This is the binding.
The linker then links the call to the available definition of the function body for f that matches the signature void f(int).
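A small sketch of that selection. The two overloads return a label here (the declarations above return void) purely so the choice is visible; the function demo_overloads is a hypothetical name:

```cpp
#include <cassert>
#include <string>

// Two overloads of the same name, mirroring the declarations above.
std::string f(int, char) { return "f(int, char)"; }
std::string f(int)       { return "f(int)"; }

// The compiler chooses the overload from the argument list at compile time.
// The linker later resolves each call to the definition of that exact
// overload (implementations typically give each overload its own symbol).
void demo_overloads() {
    assert(f(4) == "f(int)");
    assert(f(4, 'x') == "f(int, char)");
}
```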

Both actually have the same meaning in the context of C programming. Some people say binding and others say linking.
If you want to know what linking is, here is a short explanation.
Suppose you have written a user-defined function called sum(), declared as follows:
int sum(int, int);
Whenever the function is called, your program must know where to jump in memory to execute it. In simple terms, the called function's address must be known to your program in order to reach its body. Establishing that address is called binding.
Now, sum is a user-defined function, so it is present in your source code itself. If it is called from main(), the call is bound at build time (compile and link), because at that point the toolchain knows where the function will be in the executable. This is called static binding.
Now think about printf(), which is a library function whose body is not present in your program. When the C library is linked dynamically, printf's body is not in your compiled executable. It is loaded into memory when you execute your program, and its address becomes known to main at run time rather than at build time, as was the case for sum(). This type of linking is called dynamic linking.

Related

If we had a single-file project that contained all the code, could we do without the linker?

Linker question:
If I had a file.c that has no includes at all, would we still need a linker?
Although the linker is so-named because it links together multiple object files, it performs other functions as well. It may resolve addresses that were left incomplete by the compiler. It produces a program in an executable file format that the system’s program loader can read and load, and that format may differ from that of object modules. Specifics depend on the operating system and build tools.
Further, to have a complete program in one source file, you must provide not just the main routine you are familiar with from C and C++ but also the true start of the program, the entry point that the program loader starts execution at, and you must provide implementations for all functions you use in the program, such as invocations of system services via special trap or system-call instructions to read and write data.
You can create a project that has no typical C startup code, in which case you may not even have a main(). However, you still need a linker, because the linker creates the required executable file format for the given architecture.
It also sets the entry point, where the actual execution starts.
So you can omit the standard libraries and create a binary that is completely void of any C functions, but you still need the linker to actually make a runnable binary.
The object file format generated by the compiler is very different from the executable file format, because it only provides the information that the linker requires.
Yes. The linker does more than merely link the files. Check out this resource for more info: https://en.wikibooks.org/wiki/C%2B%2B_Programming/Programming_Languages/C%2B%2B/Code/Compiler/Linker#:~:text=The%20linker%20is%20a%20program,translation%20unit%20have%20external%20linkage.
Believe it or not, multiple libraries can be referenced by default. So, even if you don't #include a resource, the compiler may have to internally link or reference something outside of the translation unit. There are also redundancies and other considerations that are "eliminated" by the compiler.
Despite its name the linker is properly a "linker/locater". It performs two functions - 1) linking object code, 2) determining where in memory the data and code elements exist.
The object code out of the compiler is not "located" even if it has no unresolved links.
Also even if you have the simplest possible valid code:
int main(){ return 0; }
with no includes, the linker will normally implicitly link the C runtime start-up, which is required to do everything necessary before running main(). That may be very little. On some targets, such as ARM Cortex-M, you can in fact run C code directly from the reset vector so long as you don't assume static initialisation or complete library support. So it is possible to write the reset code entirely in C, but you probably still need code to initialise the vector table with the reset handler (your C start-up function) and the initial stack pointer. On Cortex-M that can perhaps be done using in-line assembler, but it is all rather cumbersome and unnecessary, and it does not let you forgo the linker.

Why does gdb show a different parameter order for a function

Looking through a core file (generated by C code) with gdb, I am unable to understand one particular thing about these 2 frames:
#2 increment_counter (nsteps=2, steps=0x7f3fbad26790) at gconv_db.c:393
#3 find_derivation (...) at gconv_db.c:426
This code is from open source glibc where find_derivation calls increment_counter as:
result = increment_counter (*handle, *nsteps);
*handle and steps are of the same type, and the increment_counter function is defined as static.
Why does gdb show the 2 parameters in a different order?
I am pretty sure that glibc was taken as is, without modification.
"Why does gdb show the 2 parameters in a different order?"
GDB doesn't know anything about the source (except possibly where on disk it was located at build time).
It is able to display parameters (and their values) because the compiler told it (by embedding debug info into the object file) what parameters are, in what order they appear, their types, and how to compute their value.
So why would a compiler re-order function arguments?
The function is static, so it can't be called from outside of the current translation unit. Thus the compiler is free to re-order the parameters, so long as it also re-orders the arguments at every call site.
Still, why would it do that? The general answer: optimization (the compiler found it more convenient to pass them in this order). A detailed answer would require digging into the source of GCC (or whatever compiler was used to build this code).

C++ to evaluate inclusion file during runtime

What I need to do is to "fine tune" some constant values that should be compiled along with the rest of the program, but I want to verify the results at every change without having to modify a value and recompile the whole program each time. So I was thinking of a sort of plain text configuration file that I reload every time I change a number in it, re-initializing part of the program to act on the new values. It's something that I do often, but this time I want this configuration file to take the form of a valid inclusion file with the following syntax:
const MyStructure[] =
{
{ 1, 0.5f, 0.2f, 0.77f, [other values...] },
{ 3, 0.4f, 0.1f, 0.15f, [other values...] },
[other rows...]
};
If I were using an interpreted language such as Perl, I'd have used the eval() function, which of course is not possible with C++. And while I have read other questions about the possibility of having an eval() function in C++, what I want is not to evaluate and run this code, just to parse it and put the values in the variables they belong to.
I would probably use a Regular Expression to parse the C syntax above, but again, RegExp still is not something worth using in C++, so can you suggest an alternative method?
It's probably worth saying that I need to parse this file only during the development phase. I will #include it when the program is ready for the release.
Writing your own parser is probably more work than is appropriate for this use case.
A simpler solution would be to just compile the file containing the variables separately, as a shared object or DLL, which can be loaded dynamically at run time. (Precise details depend on your OS.) You could, if desired, invoke the compiler during program initialisation as well.
If you don't want to deal with the complication of finding the symbols and copying them into static variables, you could also compile the bulk of your program as a shared object, with only a small shim as the main executable. That shim would:
1. If necessary, invoke the compiler to create the data shared object.
2. Dynamically load the data shared object.
3. Dynamically load the program shared object.
4. Invoke the main program using its main entry point (possibly using a different name).
To produce the production version, it is only necessary to compile program and data together, and use it directly without the shim.
Variations on this theme are possible, depending on precise needs.

Do you really need a main() in C++?

From what I can tell you can kick off all the action in a constructor when you create a global object. So do you really need a main() function in C++ or is it just legacy?
I can understand that it could be considered bad practice to do so. I'm just asking out of curiosity.
If you want to run your program on a hosted C++ implementation, you need a main function. That's just how things are defined. You can leave it empty if you want of course. On the technical side of things, the linker wants to resolve the main symbol that's used in the runtime library (which has no clue of your special intentions to omit it - it just still emits a call to it). If the Standard specified that main is optional, then of course implementations could come up with solutions, but that would need to happen in a parallel universe.
If you go with the "Execution starts in the constructor of my global object", beware that you set yourself up for many problems related to the order of construction of namespace scope objects defined in different translation units (So what is the entry point? The answer is: You will have multiple entry points, and which entry point is executed first is unspecified!). In C++03 you aren't even guaranteed that cout is properly constructed (in C++0x you have a guarantee that it is, before any code tries to use it, as long as there is a preceding include of <iostream>).
You don't have those problems and don't need to work around them (which can be very tricky) if you properly start executing things in ::main.
As mentioned in the comments, there are however several systems that hide main from the user by having them name a class, which is then instantiated within main. This works similarly to the following example:
class MyApp {
public:
    MyApp(std::vector<std::string> const& argv);
    int run() {
        /* code comes here */
        return 0;
    }
};
IMPLEMENT_APP(MyApp);
To the user of this system, it's completely hidden that there is a main function, but that macro would actually define such a main function as follows
#define IMPLEMENT_APP(AppClass) \
    int main(int argc, char **argv) { \
        AppClass m(std::vector<std::string>(argv, argv + argc)); \
        return m.run(); \
    }
This doesn't have the problem of unspecified order of construction mentioned above. The benefit of such systems is that they work with different forms of higher-level entry points. For example, Windows GUI programs start up in a WinMain function; IMPLEMENT_APP could then define such a function instead on that platform.
Yes! You can do away with main.
Disclaimer: You asked if it were possible, not if it should be done. This is a totally un-supported, bad idea. I've done this myself, for reasons that I won't get into, but I am not recommending it. My purpose wasn't getting rid of main, but it can do that as well.
The basic steps are as follows:
1. Find crt0.c in your compiler's CRT source directory.
2. Add crt0.c to your project (a copy, not the original).
3. Find and remove the call to main from crt0.c.
Getting it to compile and link can be difficult; how difficult depends on which compiler and which compiler version.
Added
I just did it with Visual Studio 2008, so here are the exact steps you have to take to get it to work with that compiler.
1. Create a new C++ Win32 Console Application (click next and check Empty Project).
2. Add new item.. C++ File, but name it crt0.c (not .cpp).
3. Copy contents of C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\crt\src\crt0.c and paste into crt0.c.
4. Find mainret = _tmain(__argc, _targv, _tenviron); and comment it out.
5. Right-click on crt0.c and select Properties.
6. Set C/C++ -> General -> Additional Include Directories = "C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\crt\src".
7. Set C/C++ -> Preprocessor -> Preprocessor Definitions = _CRTBLD.
8. Click OK.
9. Right-click on the project name and select Properties.
10. Set C/C++ -> Code Generation -> Runtime Library = Multi-threaded Debug (/MTd) (*).
11. Click OK.
12. Add new item.. C++ File, name it whatever (app.cpp for this example).
13. Paste the code below into app.cpp and run it.
(*) You can't use the runtime DLL, you have to statically link to the runtime library.
#include <iostream>
class App
{
public:
    App()
    {
        std::cout << "Hello, World! I have no main!" << std::endl;
    }
};
static App theApp;
Added
I removed the superfluous exit call and the blurb about lifetime as I think we're all capable of understanding the consequences of removing main.
Ultra Necro
I just came across this answer and read both it and John Dibling's objections below. It was apparent that I didn't explain what the above procedure does and why that does indeed remove main from the program entirely.
John asserts that "there is always a main" in the CRT. Those words are not strictly correct, but the spirit of the statement is. Main is not a function provided by the CRT, you must add it yourself. The call to that function is in the CRT provided entry point function.
The entry point of every C/C++ program is a function in a module named 'crt0'. I'm not sure if this is a convention or part of the language specification, but every C/C++ compiler I've come across (which is a lot) uses it. This function basically does three things:
1. Initialize the CRT
2. Call main
3. Tear down
In the example above, the call is _tmain but that is some macro magic to allow for the various forms that 'main' can have, some of which are VS specific in this case.
What the above procedure does is it removes the module 'crt0' from the CRT and replaces it with a new one. This is why you can't use the Runtime DLL, there is already a function in that DLL with the same entry point name as the one we are adding (2). When you statically link, the CRT is a collection of .lib files, and the linker allows you to override .lib modules entirely. In this case a module with only one function.
Our new program contains the stock CRT, minus its CRT0 module, but with a CRT0 module of our own creation. In there we remove the call to main. So there is no main anywhere!
(2) You might think you could use the runtime DLL by renaming the entry point function in your crt0.c file, and changing the entry point in the linker settings. However, the compiler is unaware of the entry point change and the DLL contains an external reference to a 'main' function which you're not providing, so it would not compile.
Generally speaking, an application needs an entry point, and main is that entry point. The fact that initialization of globals might happen before main is pretty much irrelevant. If you're writing a console or GUI app you have to have a main for it to link, and it's only good practice to have that routine be responsible for the main execution of the app rather than use other features for bizarre unintended purposes.
Well, from the perspective of the C++ standard, yes, it's still required. But I suspect your question is of a different nature than that.
I think doing it the way you're thinking about would cause too many problems though.
For example, in many environments the return value from main is given as the status result from running the program as a whole. And that would be really hard to replicate from a constructor. Some bit of code could still call exit of course, but that seems like using a goto and would skip destruction of anything on the stack. You could try to fix things up by having a special exception you threw instead in order to generate an exit code other than 0.
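A minimal sketch of that "special exception" idea, assuming hypothetical names (ExitRequest, Tracker, do_work, run_program). Unlike a call to exit, throwing unwinds the stack, so destructors of local objects still run before the status is produced:

```cpp
#include <cassert>

// Hypothetical exception type that carries the desired exit status.
struct ExitRequest {
    int code;
};

// Small helper that records when its destructor has run.
class Tracker {
public:
    explicit Tracker(bool& destroyed) : destroyed_(destroyed) {}
    ~Tracker() { destroyed_ = true; }
private:
    bool& destroyed_;
};

// Somewhere deep in the program: request termination with a status.
void do_work(int status, bool& cleaned_up) {
    Tracker guard(cleaned_up);  // lives on the stack
    if (status != 0) {
        throw ExitRequest{status};
    }
}

// The single place that converts the request into a status value.
int run_program(int status, bool& cleaned_up) {
    try {
        do_work(status, cleaned_up);
    } catch (const ExitRequest& request) {
        return request.code;
    }
    return 0;
}

// Usage: the status comes back and the stack object was destroyed.
void demo_exit_request() {
    bool cleaned_up = false;
    assert(run_program(3, cleaned_up) == 3);
    assert(cleaned_up);
}
```

Something still has to catch the exception and hand the code to the environment, which is exactly the job main normally does by returning a value.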
But then you still run into the problem of the order of execution of global constructors not being defined. That means that in any particular constructor for a global object you won't be able to make any assumptions about whether or not any other global object yet exists.
You could try to solve the constructor order problem by just saying each constructor gets its own thread, and if you want to access any other global objects you have to wait on a condition variable until they say they're constructed. That's just asking for deadlocks though, and those deadlocks would be really hard to debug. You'd also have the issue of which thread exiting with the special 'return value from the program' exception would constitute the real return value of the program as a whole.
I think those two issues are killers if you want to get rid of main.
And I can't think of a language that doesn't have some basic equivalent to main. In Java, for example, there is an externally supplied class name whose main static function is called. In Python, there's the __main__ module. In Perl there's the script you specify on the command line.
If you have more than one global object being constructed, there is no guarantee as to which constructor will run first.
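One commonly used workaround, which is not something the answers here propose, is to replace a global object with a function-local static so that it is constructed on first use. A sketch under that assumption, with hypothetical names (Log, app_log, Worker):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical shared service that other global objects want to use.
class Log {
public:
    void add(const std::string& message) {
        assert(!message.empty());
        messages_.push_back(message);
    }
    std::size_t size() const { return messages_.size(); }
private:
    std::vector<std::string> messages_;
};

// Function-local static: constructed the first time app_log() is called,
// so it exists before any caller uses it, whatever order the globals in
// other translation units happen to be constructed in.
Log& app_log() {
    static Log instance;
    return instance;
}

// A global whose constructor depends on the log.
struct Worker {
    Worker() { app_log().add("Worker constructed"); }
};

static Worker worker;  // runs before main(); safe because of app_log()
```

This only addresses construction order. Destruction order at program exit remains a separate concern.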
If you are building static or dynamic library code then you don't need to define main yourself, but you will still wind up running in some program that has it.
If you are coding for Windows, do not do this.
Running your app entirely from within the constructor of a global object may work just fine for quite a while, but sooner or later you will make a call to the wrong function and end up with a program that terminates without warning.
Global object constructors run during the startup of the C runtime.
The C runtime startup code runs during the DllMain of the C runtime DLL.
During DllMain, you are holding the DLL loader lock.
Trying to load another DLL while already holding the DLL loader lock results in a swift death for your process.
Compiling your entire app into a single executable won't save you - many Win32 calls have the potential to quietly load system DLLs.
There are implementations where global objects are not possible, or where non-trivial constructors are not possible for such objects (especially in the mobile and embedded realms).

Compiling a DLL with gcc

Sooooo I'm writing a script interpreter. And basically, I want some classes and functions stored in a DLL, but I want the DLL to look for functions within the programs that are linking to it, like,
program                          dll
----------------------------------------------------
send code to dll ------------->  parse code
                                     |
                                     v
                                 code contains a function,
                                 that isn't contained in the DLL
                                     |
list of functions in  <--------------/
program
    |
    v
corresponding function,
user-defined in the
program--process the
passed argument here
    |
    \------------------------->  return value sent back
                                 to the parsing function
I was wondering basically, how do I compile a DLL with gcc? Well, I'm using a windows port of gcc. Once I compile a .dll containing my classes and functions, how do I link to it with my program? How do I use the classes and functions in the DLL? Can the DLL call functions from the program linking to it? If I make a class { ... } object; in the DLL, then when the DLL is loaded by the program, will object be available to the program? Thanks in advance, I really need to know how to work with DLLs in C++ before I can continue with this project.
"Can you add more detail as to why you want the DLL to call functions in the main program?"
I thought the diagram sort of explained it... the program using the DLL passes a piece of code to the DLL, which parses the code, and if function calls are found in said code then corresponding functions within the DLL are called... for example, if I passed "a = sqrt(100)" then the DLL parser function would find the function call to sqrt(), and within the DLL would be a corresponding sqrt() function which would calculate the square root of the argument passed to it, and then it would take the return value from that function and put it into variable a... just like any other program, but if a corresponding handler for the sqrt() function isn't found within the DLL (there would be a list of natively supported functions) then it would call a similar function which would reside within the program using the DLL to see if there are any user-defined functions by that name.
So, say you loaded the DLL into the program, giving your program the ability to interpret scripts of this particular language. The program could call the DLL to process single lines of code or hand it filenames of scripts to process... but if you want to add a command into the script which suits the purpose of your program, you could, say, set a boolean value in the DLL telling it that you are adding functions to its language and then create a function in your code which would list the functions you are adding (the DLL would call it with the name of the function it wants; if that function is a user-defined one contained within your code, the function would call the corresponding function with the argument passed to it by the DLL, then return the return value of the user-defined function back to the DLL, and if it didn't exist, it would return an error code or NULL or something). I'm starting to see that I'll have to find another way around this to make the function calls go one way only.
This link explains how to do it in a basic way.
In a big picture view, when you make a dll, you are making a library which is loaded at runtime. It contains a number of symbols which are exported. These symbols are typically references to methods or functions, plus compiler/linker goo.
When you normally build a static library, there is a minimum of goo and the linker pulls in the code it needs and repackages it for you in your executable.
In a dll, you actually get two end products (three really- just wait): a dll and a stub library. The stub is a static library that looks exactly like your regular static library, except that instead of executing your code, each stub is typically a jump instruction to a common routine. The common routine loads your dll, gets the address of the routine that you want to call, then patches up the original jump instruction to go there so when you call it again, you end up in your dll.
The third end product is usually a header file that tells you all about the data types in your library.
So your steps are: create your headers and code, build a dll, build a stub library from the headers/code/some list of exported functions. End code will link to the stub library which will load up the dll and fix up the jump table.
Compiler/linker goo includes things like making sure the runtime libraries are where they're needed, making sure that static constructors are executed, making sure that static destructors are registered for later execution, etc, etc, etc.
Now as to your main problem: how do I write extensible code in a dll? There are a number of possible ways - a typical way is to define a pure abstract class (aka interface) that defines a behavior and either pass that in to a processing routine or to create a routine for registering interfaces to do work, then the processing routine asks the registrar for an object to handle a piece of work for it.
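A sketch of the registrar variant described above. All names (ScriptFunction, FunctionRegistry, evaluate, Twice) are hypothetical; the first three would live on the DLL side and Twice in the host program. It assumes both sides are built with the same compiler and runtime:

```cpp
#include <cassert>
#include <map>
#include <string>

// Interface (pure abstract class) that the library knows about.
class ScriptFunction {
public:
    virtual ~ScriptFunction() = default;
    virtual double call(double argument) = 0;
};

// Registrar: the host program adds handlers, the parser looks them up.
class FunctionRegistry {
public:
    void add(const std::string& name, ScriptFunction* handler) {
        assert(handler != nullptr);
        handlers_[name] = handler;  // not owned by the registry
    }
    // Returns nullptr when nothing is registered under this name.
    ScriptFunction* find(const std::string& name) const {
        auto it = handlers_.find(name);
        return it == handlers_.end() ? nullptr : it->second;
    }
private:
    std::map<std::string, ScriptFunction*> handlers_;
};

// Processing routine: asks the registrar for a handler and calls it
// through the interface. Returns false if the name is unknown.
bool evaluate(const FunctionRegistry& registry, const std::string& name,
              double argument, double& result) {
    ScriptFunction* handler = registry.find(name);
    if (handler == nullptr) {
        return false;
    }
    result = handler->call(argument);
    return true;
}

// Host program side: a user-defined script function.
class Twice : public ScriptFunction {
public:
    double call(double argument) override { return 2.0 * argument; }
};
```

The library only ever sees the ScriptFunction interface, so all calls still go in one direction: the program hands objects to the library, and the library calls back through them.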
On the detail of what you plan to solve, perhaps you should look at an extensible interpreter like Lua instead of building your own.
To your more specific focus.
A DLL is (typically?) meant to be complete in and of itself, or explicitly know what other libraries to use to complete itself.
What I mean by that is, you cannot have a method implicitly provided by the calling application to complete the DLL's functionality.
You could however make part of your API the provision of methods from a calling app, thus making the DLL fully contained and the passing of knowledge explicit.
How do I use the classes and functions in the DLL?
Include the headers in your code. When the module (exe or another DLL) is linked, the DLLs are checked for completeness.
Can the DLL call functions from the program linking to it?
Yes, but it has to be told about them at run time.
If I make a class { ... } object; in the DLL, then when the DLL is loaded by the program, will object be available to the program?
Yes, it will be available; however, there are some restrictions you need to be aware of. For example, in the area of memory management it is important to do one of the following:
Link all modules sharing memory with the same memory management DLL (typically the C runtime).
Ensure that memory is allocated and deallocated only in the same module.
Allocate on the stack.
Examples!
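Here is a basic idea of passing functions to the DLL; however, it may not be the most helpful in your case, as you need to know up front which other functions you want provided.
// parser.h
#include <string>
struct functions {
    void (*fred)(int);  // pointer to a function supplied by the program
};
// The void return type and parameter names are placeholders.
void parse(const std::string& code, const functions& callbacks);
// program.cpp
void fred(int value);  // user-defined function in the program
functions callbacks = { &fred };
// ...then, inside some function:
parse("a = sqrt(); fred(a);", callbacks);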
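What you need is a way of registering functions (and their details) with the DLL.
The bigger problem here is the details bit. But skipping over that, you might do something like wxWidgets does with class registration. When method_fred's static _methodInfo member is constructed by your app, its constructor registers the method with the DLL through methodInfo. The parser can then look up the available methods in methodInfo.
// parser.h
#include <map>
#include <string>
class method_base { public: virtual ~method_base() {} };
typedef method_base* (*factory_fn)(std::string args);
class methodInfo {
public:
    // Constructing a methodInfo registers the factory under the given name.
    methodInfo(const std::string& name, factory_fn f) { register_method(name, f); }
    // Renamed from "register", which is a reserved word in C++.
    static void register_method(const std::string& name, factory_fn f);
    static std::map<std::string, factory_fn> m_methods;
};
// program.cpp
class method_fred : public method_base {
public:
    static method_base* factory(std::string args);
    static methodInfo _methodInfo;
};
methodInfo method_fred::_methodInfo("fred", method_fred::factory);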
This sounds like a job for data structures.
Create a struct containing your keywords and the function associated with each one.
struct keyword {
    const char *keyword;
    int (*f)(int arg);
};
struct keyword keywords[max_keywords] = {
    { "db_connect", &db_connect },
};
Then write a function in your DLL that you pass the address of this array to:
plugin_register(keywords);
Then inside the DLL it can do:
keywords[0].f = &plugin_db_connect;
With this method, the code to handle script keywords remains in the main program while the DLL manipulates the data structures to get its own functions called.
Taking it to C++, make the struct a class instead that contains a std::vector or std::map or whatever of keywords and some functions to manipulate them.
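A sketch of what that C++ version might look like, using std::map and std::function. KeywordTable is a hypothetical name, and the bodies of db_connect and plugin_db_connect are invented stand-ins:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Keyword table as a class: maps each script keyword to its handler.
class KeywordTable {
public:
    using Handler = std::function<int(int)>;

    // Adds a keyword, or replaces the handler of an existing one.
    void set(const std::string& keyword, Handler handler) {
        assert(handler != nullptr);
        handlers_[keyword] = std::move(handler);
    }

    bool contains(const std::string& keyword) const {
        return handlers_.count(keyword) != 0;
    }

    // Calls the handler for the keyword; std::map::at throws
    // std::out_of_range if the keyword is unknown.
    int call(const std::string& keyword, int arg) const {
        return handlers_.at(keyword)(arg);
    }

private:
    std::map<std::string, Handler> handlers_;
};

// Stand-ins for the handlers named above; the bodies are invented.
int db_connect(int arg) { return arg; }
int plugin_db_connect(int arg) { return arg + 1; }

// What plugin_register might do on the DLL side: swap in its own handler.
void plugin_register(KeywordTable& table) {
    table.set("db_connect", &plugin_db_connect);
}
```

The dispatch code stays in the main program and only the table contents change. Note that handing standard library objects across a DLL boundary assumes both sides use the same compiler and runtime library.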
Winrawr, before you go on, read this first:
Any improvements on the GCC/Windows DLLs/C++ STL front?
Basically, you may run into problems when passing STL strings around your DLLs, and you may also have trouble with exceptions flying across DLL boundaries, although it's not something I have experienced (yet).
You could always load the DLL at run time with LoadLibrary.