How library classes are instantiated - c++

I'm going to ask how it's done in c++, but this idea can apply to multiple languages. If you know how to do it in objective-c as well, please provide any similarities between the two
Lets say I want to create an instance of an ofstream like
ofstream myfile;
I'm assuming all I have on my computer is the *.o file (in a library archive) and the *.h file for iostream class. If this part isn't true let me know. I am assuming this when all I have installed is the runtime and the devel packages, not the source files.
How does it connect the header file to the object file, is there a naming scheme. And where does it look and in what order.?
Why this is confusing me is normally when I want to create a class I link my implementation of the class with the program, so where does it now and how does it now to link the files?
One more, does it matter if it loaded statically or dynamically?
Thank you in advance, and sry if this is a silly question.

Computer Science 101:
Broadly speaking (VERY broadly!), there are two kinds of "programs":
a) Interpreted: you read the program source line-by-line every time you execute it
<= *nix shell scripts and DOS .bat files are "interpeted"
b) Compiled: you read the source once (to convert it into a "binary machine code"). You link the machine code "object files" to build an "executable program".
You're talking about "compiled programs"
The "ofstream" part is irrelevant once the program is "compiled"
The binary implementation for "ofstream" can be compiled directly into the executable, or it can be dynamically loaded from a shared library (.dll) at runtime.
A "compiler" users ".h" headers to process the source file.
A "linker" uses ".lib" libraries to match symbols and link static code at link type.
The "Operating System" recognizes dynamic links and loads the needed shared libraries (.dll's) at runtime.
Three different things, all independent of each other: Compiler/source code, Linker/machine object code, OS/executable programs
'Hope that helps .. a bit...

This is not standardized and it's up to the implementation. I don't know about *unix, but I assume it's fairly similar to Windows.
You can assume that .o files are similar to library files .lib.
The header does define the class definition, so that the linker knows what to look for in the library.
Say you have a header:
class A
{
public:
A();
void foo();
};
and a lib file A.lib.
You include that header and call:
A a;
a.foo();
The compiler finds the declarations for bot A() and A::foo(). Now it knows it has to search the library for these functions. Names in the library are decorated, and contain modifiers, but its specific to the compiler so the linker finds the functions if they are exported in the library. It then binds the functions to the specific entry point from the dll.
If by dynamic loading you mean using LoadModule() and GetProcAddress() instead of linking, than the concept is pretty similar.

If you do static linking all symbols with linkage are available in the .obj file. The linker binds the calls of the functions to the entry points of the functions. There is a name mangeling involved in this process so that the symbols can be resolved correctly.
Dynamic linking is a platform dependent issue and not part of the C or C++ standard as far as I know.

Related

How Iostream file is located in computer by c++ code during execution

i want to know that in a c++ code during execution how iostream file is founded. we write #include in c++ program and i know about #include which is a preprocessor directive
to load files and is a file name but i don't know that how that file is located.
i have some questions in my mind...
Is Standard library present in compiler which we are using?
Is that file is present in standard library or in our computer?
Can we give directory path to locate the file through c++ code if yes then how?
You seem to be confused about the compilation and execution model of C++. C++ is generally not interpreted (though it could) but instead a native binary is produced during the compilation phase, which is then executed... So let's take a detour.
In order to go from a handful of text files to a program being executed, there are several steps:
compilation
link
load
execution proper
I will only describe what traditional compilers do (such as gcc or clang), potential variations will be indicated later on.
Compiling
During compilation, each source file (generally .cpp or .cxx though the compiler could care less) is processed to produced an object file (generally .o on Linux):
The source file is preprocessed: this means resolving the #include (copy/pasting the included file in the current file), navigating the #if and #else to remove unneeded sources and expanding macros.
The preprocessed file is fed to the compiler which will produce native code for each function, static or global variable, etc... the format depends on the target system in general
The compiler outputs the object file (it's a binary format, in general)
Linking
In this phase, multiple object files are assembled together into a library or an executable. In the case of a static library or a statically-linked executable, the libraries it depend on are also assembled in the produced file.
Traditionally, the linker job is relatively simple: it just concatenates all object files, which already contain the binary format the target machine can execute. However it often actually does more: in C and C++ inline functions are duplicated across object files, so the linker need to keep only one of the definitions, for example.
At this point, the program is "compiled", and we live the realm of the compiler.
Loading
When you ask to execute a program, the OS will load its code into memory (thanks to a loader) and execute it.
In the case of a statically linked executable, this is easy: it's just a single big blob of code that need be loaded. In the case of a dynamically linked executable, it implies finding the dependencies and "resolving the symbols", I'll describe this below:
First of all, your dynamically linked executable and libraries have a section describing which other libraries they depend on. They only have the name of the library, not its exact location, so the loader will search among a list of paths (LD_LIBRARY_PATH for example on Linux) for the libraries and actually load them.
When loading a library, the loader will perform replacements. Your executable had placeholders saying "Here should be the address of the printf function", and the loader will replace that placeholder with the actual address.
Once everything is loaded properly, all symbols should be resolved. If some symbols are missing, ie the code for them is not found in either the present library or any of its dependencies, then you generally get an error (either immediately, or only when the symbol is actually needed if you use lazy loading).
Executing
The code (assembler instruction in binary format) is now executed. In C++, this starts with building the global and static objects (at file scope, not function-static), and then goes on to calling your main function.
Caveats: this is a simplified view, nowadays Link-Time Optimizations mean that the linker will do more and more, the loader could actually perform optimizations too and as mentioned, using lazy loading, the loader might be invoked after the execution started... still, you've got to start somewhere, don't you ?
So, what does it mean about your question ?
The #include <iostream> in your source code is a pre-processor directive. It is thus fully resolved early in the compilation phase, and only depends on finding the appropriate header (no library code is actually needed yet). Note: a compiler would be allowed not to have a header file sitting around, and just magically inject the necessary code as if the header file existed, because this is a Standard Library header (thus special). For regular headers (yours) the pre-processor is invoked.
Then, at link-time:
if you use static linking, the linker will search for the Standard Library and include it in the executable it produces
if you use dynamic linking, the linker will note that it depends on the Standard Library file (libc++.so for example) and that the produced code is missing an implementation of printf (for example)
Then, at load time:
if you used dynamic linking, the loader will search for the Standard Library and load its code
Then, at execution time, the code (yours and its dependencies) is finally executed.
Several Standard Library implementations exist, off the top of my head:
MSVC ships with a modified version of Dirkumware
gcc ships with libstdc++ (which depends on libc)
clang ships with libc++ (which depends on libc), but may use libstdc++ instead (with compiler flags)
And of course, you could provide others... probably... though setting it up might not be easy.
Which is ultimately used depends on the compiler options you use. By default the most common compiler ship with their own implementation and use it without any intervention on your part.
And finally yes, you can indeed specify paths in a #include directive. For example, using boost:
#include <boost/optional.hpp>
#include <boost/algorithm/string/trim.hpp>
you can even use relative paths:
#include <../myotherproject/x.hpp>
though this is considered poor form by some (since it breaks as soon as your reorganize your files).
What matters is that the pre-processor will look through a list of directories, and for each directory append / and the path you specified. If this creates the path of an existing file, it picks it, otherwise it continues to the next directory... until it runs out (and complain).
The <iostream> file is just not needed during execution. But that's just the header. You do need the standard library, but that's generally named differently if not included outright into your executable.
The C++ Standard Library doesn't ship with your OS, although on many Linux systems the line between OS and common libraries such as the C++ Standard Library is a bit thin.
How you locate that library is very much OS dependent.
There can be 2 ways to load a header file (like iostream.h) in C++
if you write the code as:
# include <iostream>
It will look up the header file in include directory of your C++ compiler and will load it
Other way you can give the full path of the header file as:
# include "path_of_file.h"
And loading up the file is OS dependent as answered by MSalters
You definitely required the standard library header files so that pre-processor directive can locate them.
Yes those files are present in the library and on include copied to our code.
if we had defined the our own header file then we have to give path of that file. In that way we can include also *.c or *.cpp along with the header files in which we had defined various methods and those had to include at pre-processing time.

why do I need to link a lib file to my project?

I am creating a project that uses a DLL. To build my project, I need to include a header file, and a lib file. Why do I need to include the respective lib file? shouldn't the header file declare all the needed information and then at runtime load any needed library/dll?
Thanks
In many other languages, the equivalent of the header file is all you need. But the common C linkers on Windows have always used import libraries, C++ linkers followed suit, and it's probably too late to change.
As a thought experiment, one could imagine syntax like this:
__declspec(dllimport, "kernel32") void __stdcall Sleep(DWORD dwMilliseconds);
Armed with that information the compiler/linker tool chain could do the rest.
As a further example, in Delphi one would import this function, using implicit linking, like so:
procedure Sleep(dwMilliseconds: DWORD); stdcall; external 'kernel32';
which just goes to show that import libraries are not, a priori, essential for linking to DLLs.
That is a so-called "import library" that contains minimal wiring that will later (at load time) ask the operating system to load the DLL.
DLLs are a Windows (MS/Intel) thing. The (generated) lib contains the code needed to call into the DLL and it exposes 'normal' functions to the rest of your App.
No, the header file isn't necassarily enough. The header file can contain just the declarations of the functions and classes and other things you need, not their implementations.
There is a world of difference between this code:
void Multiply(int x, int y);
and this code:
void Multiply(int x, int y)
{
return x * y;
}
The first is a declaration, and the second is a definition or implementation. Usually the first example is put in header files, and the second one is put in .CPP files (If you are creating libraries). If you included a header with the first and didn't link in anything, how is your application supposed to know how to implement Multiply?
Now if you are using header files that contain code that is ALL inlined, then you do not need to link anything. But if even one method is NOT inlined, but has its implementation in a .CPP file that is compiled to a .lib file, than you need to link in the .lib file.
[EDIT]
With your use of Import Libraries, you are telling the linker to NOT include the implementation details of the imported code into your binary. Instead the OS will then load the import DLL at run-time into your process. This will make your application smaller, but you have to ship another DLL with it. If the implementation of the library changes, you can just reship another DLL to your customers, and not have to reship the entire application.
There is another option where you can just link in a library and you don't need to ship another DLL. That option is where the Linker will include the implementation into your application, making it bigger in size. If you have to change the implementation details in the imported library, then you have to recompile and relink your entire application, and reship the entire thing to your customers.
There are two relevant phases in the building process here:
compilation: from the source code to an object file. During the compilation, the compiler needs to know what external things are available, for that one needs a declaration. Declarations designed to be used in several compilation units are grouped in header. So you need the headers for the library.
linking: For static libraries, you need the compiled version of the library. For dynamic libraries, in Unix you need the library, in windows, you need the "import library".
You could think that a library could also embed the declarations or the header could include the library which needs to be linked. The first is often done in other languages. The second is sometimes available through pragmas in C and C++, but there is no standard way to do this and would be in conflict with common usage (such as choosing a library among several which provide code variant for the same declarations, for instance debug/release single thread/multithreads). And neither choice correspond well with the simple compilation model of C and C++ which has its roots in the 60's.
The header file is consumed by the compiler. It contains all the forward declarations of functions, classes and global variables that will be used. It may also contain some inline function definitions as well.
These are used by the compiler to give it the bare minimum information that it needs to compile your code. It will not contain the implementation details.
However you still need to link in all the function, and variable definitions that you have told the compiler about. Failure to do so will result in a linker error. Often this is contains in other object files which may be joined into a single static library.
In the case of DLLs (or .so files), we still need to tell the linker where in the DLL or shared object the missing symbols are. On windows, this information is contained in a .lib file. This will generate the code to load and link the code at runtime.
On unix the the dll and lib files are combined into a single .so file which you must link against to about linker errors.
You can still use a dll without a .lib file but you will then have to load and link in all the symbols manually using operating system APIs.
from 1000 ft, the lib contains the list of the functions that dll exports and addresses that are needed for the call.

Calling C/C++ functions in dynamic and static libraries in D

I'm having trouble wrapping my head around how to interface with C/C++ libraries, both static (.lib/.a) and dynamic (.dll/.so), in D. From what I understand, it's possible to tell the DMD compiler to link with .lib files, and that you can convert .dll files to .lib with the implib tool that Digital Mars provides. In addition, I've come across this page, which implies being able to call functions in .dlls by converting C header files to D interface files. Are both of these methods equivalent? Would these same methods work for Unix library files? Also, how would one reference functions, enums, etc from these libraries, and how would one tell their D compiler to link with these libs (I'm using VisualD, specifically)? If anyone could provide some examples of referencing .lib, .dll, .a, and .so files from D code, I'd be most grateful.
Note you are dealing with three phases for generating an executable. During compilation you are creating object files (.lib/.a are just archives of object files). Once these files are created you use a Linker to put all the pieces together. When dealing with dynamic libraries (.dll, .so) there is the extra step of loading the library when the program starts/during run-time.
During compilation the compiler only needs to be aware of what you are using, it doesn't care if it is implemented. This is where the D interface files come in and are kind of equivalent to Header Files in this respect. Enumerations are declared in the D interface file and must also be defined because they only exist at compile time. Functions and variables can just be declared with no body.
int myFunction(char* str);
The guide for converting a header file to D is in the page you referenced. These files can then be passed to the compiler or exist in the Include Path.
When the linker runs is when you'll need the .lib/.a file. These files can be passed to the compiler which will forward them to the Linker or you can use pragma(lib, "my.lib"); in your program. In both cases the linker must be able to finding at link time (compilation).
In Linux I don't believe there is a difference for linking dynamic and static. In Windows you don't even need the D interface file. Instead you must obtain the function through system calls. I'm really not that familiar with this area, but I suggest Loading Plugins (DLLs) on-the-fly
Update: I can't help much with VisualD, but there is D for .NET Programmers.
There samples in D distribution of how to do this.
You need to define thunk module like this:
module harmonia.native.win32;
version(build) { pragma(nolink); }
export int DialogBoxParamA(HINSTANCE hInstance, LPCSTR lpTemplateName,
HWND hWndParent, DLGPROC lpDialogFunc, LPARAM dwInitParam);
and include import libs of DLLs where functions like DialogBoxParamA are defined.

LNK4221 and LNK4006 Warnings

I am making a static library of my own. I have taken my code which works and now put it into a static library for another program to use. In my library I am using another static library which I don't want the people who will be using my API to know. Since, I want to hide that information from them I can't tell them to install the other static library.
Anyway, I used the command line Lib.exe to extract and create a smaller lib file of just the obj's I used. However, I get a bunch of LNK4006 :second definition ignored linker warnings for each obj I use followed by LNK4221 no public symbols found;archive member will be inaccessible.
I am doing this work in vs2008 and I am not sure what I am doing wrong.
I am using the #pragma comment line in my .cpp file
I have also modified the librarian to add my smaller .lib along with its location.
my code simply makes calls to a couple functions which it should be able to get from those Obj file in the smaller lib.
All my functions are implemented in .cpp file and my header just have the includes of the third party header files and come standard c++ header files. nothing fancy. I have actually no function definitions in there atm. I was going to put the API definition in there and implement that in the .cpp for this static lib that i was going to make. However, I just wanted to build my code before I added more to it.
I did read http://support.microsoft.com/default.aspx?scid=kb;EN-US;815773 but it did not provide a solution.
Even if you extract all objects from the other library and put them in your own library, your users will still be able to see what's in your library and thus see all the object names. In many cases the names of the objects will reveal what's actually the other library you are using.
Instead of distributing your library as a static library, consider distributing it as a DLL. In the DLL you can easily hide all the underlying things and only make public what you want to make public.

static link library

I am writing a hello world c++ application, in the instruction #include help the compiler or linker to import the c++ library. My " cout << "hello world"; " use a cout in the library. The question is after compile and generated exe is about 96k in size, so what instructions are actually contained in this exe file, does this file also contains the iostream library?
Thanks
In the general case, the linker will only bring in what it needs. Once the compiler phase has turned your source code into an object file, it's treated much the same as all other object files. You have:
the C start-up code which prepares the execution environment (sets up argv, argv and so on) then calls your main or equivalent.
your code itself.
whatever object files need to be dragged in from libraries (dynamic linking is a special case of linking that happens at runtime and I won't cover that here since you asked specifically about static linking).
The linker will include all the object files you explicitly specify (unless it's a particularly smart linker and can tell you're not using the object file).
With libraries, it's a little different. Basically, you start with a list of unresolved symbols (like cout). The linker will search all the object files in all the libraries you specify and, when it finds an object file that satisfies that symbol, it will drag it in and fix up the symbol references.
This may, of course, add even more unresolved symbols if, for example, there was something in the object file that relies on the C printf function (unlikely but possible).
The linker continues like this until all symbols are satisfied (when it gives you an executable) or one cannot be satisfied (when it complains to you bitterly about your coding practices).
So as to what is in your executable, it may be the entire iostream library or it may just be the minimum required to do what you asked. It will usually depend on how many object files the iostream library was built into.
I've seen code where an entire subsystem went into one object file so, that if you wanted to just use one tiny bit, you still got the lot. Alternatively, you can put every single function into its own object file and the linker will probably create an executable as small as possible.
There are options to the linker which can produce a link map which will show you how things are organised. You probably won't generally see it if you're using the IDE but it'll be buried deep within the compile-time options dialogs under MSVC.
And, in terms of your added comment, the code:
cout << "hello";
will quite possibly bring in sizeable chunks of both the iostream and string processing code.
Use cl /EHsc hello.cpp -link /MAP. The .map file generated will give you a rough idea which pieces of the static library are present in the .exe.
Some of the space is used by C++ startup code, and the portions of the static library that you use.
In windows, the library or part of the libraries (which are used) are also usually included in the .exe, the case is different in case of Linux. However, there are optimization options.
I guess this Wiki link will be useful : Static Libraries