I'm trying to understand what happens when modules with globals and static variables are dynamically linked to an application.
By modules, I mean each project in a solution (I work a lot with visual studio!). These modules are either built into *.lib or *.dll or the *.exe itself.
I understand that the binary of an application contains global and static data of all the individual translation units (object files) in the data segment (and read only data segment if const).
What happens when this application uses a module A with load-time dynamic linking? I assume the DLL has a section for its globals and statics. Does the operating system load them? If so, where do they get loaded to?
And what happens when the application uses a module B with run-time dynamic linking?
If I have two modules in my application that both use A and B, are copies of A and B's globals created as mentioned below (if they are different processes)?
Do DLLs A and B get access to the applications globals?
(Please state your reasons as well)
Quoting from MSDN:
Variables that are declared as global in a DLL source code file are treated as global variables by the compiler and linker, but each process that loads a given DLL gets its own copy of that DLL's global variables. The scope of static variables is limited to the block in which the static variables are declared. As a result, each process has its own instance of the DLL global and static variables by default.
and from here:
When dynamically linking modules, it can be unclear whether different libraries have their own instances of globals or whether the globals are shared.
Thanks.
This is a pretty famous difference between Windows and Unix-like systems.
No matter what:
Each process has its own address space, meaning that there is never any memory being shared between processes (unless you use some inter-process communication library or extensions).
The One Definition Rule (ODR) still applies, meaning that you can only have one definition of the global variable visible at link-time (static or dynamic linking).
So, the key issue here is really visibility.
In all cases, static global variables (or functions) are never visible from outside a module (dll/so or executable). The C++ standard requires that these have internal linkage, meaning that they are not visible outside the translation unit (which becomes an object file) in which they are defined. So, that settles that issue.
Where it gets complicated is when you have extern global variables. Here, Windows and Unix-like systems are completely different.
In the case of Windows (.exe and .dll), the extern global variables are not part of the exported symbols. In other words, different modules are in no way aware of global variables defined in other modules. This means that you will get linker errors if you try, for example, to create an executable that is supposed to use an extern variable defined in a DLL, because this is not allowed. You would need to provide an object file (or static library) with a definition of that extern variable and link it statically with both the executable and the DLL, resulting in two distinct global variables (one belonging to the executable and one belonging to the DLL).
To actually export a global variable in Windows, you have to use a syntax similar to the function export/import syntax, i.e.:
#ifdef COMPILING_THE_DLL
#define MY_DLL_EXPORT extern "C" __declspec(dllexport)
#else
#define MY_DLL_EXPORT extern "C" __declspec(dllimport)
#endif
MY_DLL_EXPORT int my_global;
When you do that, the global variable is added to the list of exported symbols and can be linked like all the other functions.
In the case of Unix-like environments (like Linux), the dynamic libraries, called "shared objects" with extension .so export all extern global variables (or functions). In this case, if you do load-time linking from anywhere to a shared object file, then the global variables are shared, i.e., linked together as one. Basically, Unix-like systems are designed to make it so that there is virtually no difference between linking with a static or a dynamic library. Again, ODR applies across the board: an extern global variable will be shared across modules, meaning that it should have only one definition across all the modules loaded.
Finally, in both cases, for Windows or Unix-like systems, you can do run-time linking of the dynamic library, i.e., using either LoadLibrary() / GetProcAddress() / FreeLibrary() or dlopen() / dlsym() / dlclose(). In that case, you have to manually get a pointer to each of the symbols you wish to use, and that includes the global variables you wish to use. For global variables, you can use GetProcAddress() or dlsym() just the same as you do for functions, provided that the global variables are part of the exported symbol list (by the rules of the previous paragraphs).
And of course, as a necessary final note: global variables should be avoided. And I believe that the text you quoted (about things being "unclear") is referring exactly to the platform-specific differences that I just explained (dynamic libraries are not really defined by the C++ standard, this is platform-specific territory, meaning it is much less reliable / portable).
The answer left by Mikael Persson, although very thorough, contains a severe error (or at least misleading), in regards to the global variables, that needs to be cleared up. The original question asked if there were seperate copies of the global variables or if global variables were shared between the processes.
The true answer is the following: There are seperate (multiple) copies of the global variables for each process, and they are not shared between processes. Thus by stating the One Definition Rule (ODR) applies is also very misleading, it does not apply in the sense they are NOT the same globals used by each process, so in reality it is not "One Definition" between processes.
Also even though global variables are not "visible" to the process,..they are always easily "accesible" to the process, because any function could easily return a value of a global variable to the process, or for that matter, a process could set a value of a global variable through a function call. Thus this answer is also misleading.
In reality, "yes" the processes do have full "access" to the globals, at the very least through the funtion calls to the library. But to reiterate, each process has it's own copy of the globals, so it won't be the same globals that another process is using.
Thus the entire answer relating to external exporting of globals really is off topic, and unnecessary and not even related to the original question. Because the globals do not need extern to be accessed, the globals can always be accessed indirectly through function calls to the library.
The only part that is shared between the processes, of course, is the actual "code". The code only loaded in one place in physical memory (RAM), but that same physical memory location of course is mapped into the "local" virtual memory locations of each process.
To the contrary, a static library has a copy of the code for each process already baked into the executable (ELF, PE, etc.), and of course, like dynamic libraries has seperate globals for each process.
In unix systems:
It is to be noted , that the linker does not complain if two dynamic libraries export same global variables. but during execution a segfault might arise depending on access violations. A usual number exhibiting this behavior would be segmentation fault 15
segfault at xxxxxx ip xxxxxx sp xxxxxxx error 15 in a.out
Related
I'm trying to understand what happens when modules with globals and static variables are dynamically linked to an application.
By modules, I mean each project in a solution (I work a lot with visual studio!). These modules are either built into *.lib or *.dll or the *.exe itself.
I understand that the binary of an application contains global and static data of all the individual translation units (object files) in the data segment (and read only data segment if const).
What happens when this application uses a module A with load-time dynamic linking? I assume the DLL has a section for its globals and statics. Does the operating system load them? If so, where do they get loaded to?
And what happens when the application uses a module B with run-time dynamic linking?
If I have two modules in my application that both use A and B, are copies of A and B's globals created as mentioned below (if they are different processes)?
Do DLLs A and B get access to the applications globals?
(Please state your reasons as well)
Quoting from MSDN:
Variables that are declared as global in a DLL source code file are treated as global variables by the compiler and linker, but each process that loads a given DLL gets its own copy of that DLL's global variables. The scope of static variables is limited to the block in which the static variables are declared. As a result, each process has its own instance of the DLL global and static variables by default.
and from here:
When dynamically linking modules, it can be unclear whether different libraries have their own instances of globals or whether the globals are shared.
Thanks.
This is a pretty famous difference between Windows and Unix-like systems.
No matter what:
Each process has its own address space, meaning that there is never any memory being shared between processes (unless you use some inter-process communication library or extensions).
The One Definition Rule (ODR) still applies, meaning that you can only have one definition of the global variable visible at link-time (static or dynamic linking).
So, the key issue here is really visibility.
In all cases, static global variables (or functions) are never visible from outside a module (dll/so or executable). The C++ standard requires that these have internal linkage, meaning that they are not visible outside the translation unit (which becomes an object file) in which they are defined. So, that settles that issue.
Where it gets complicated is when you have extern global variables. Here, Windows and Unix-like systems are completely different.
In the case of Windows (.exe and .dll), the extern global variables are not part of the exported symbols. In other words, different modules are in no way aware of global variables defined in other modules. This means that you will get linker errors if you try, for example, to create an executable that is supposed to use an extern variable defined in a DLL, because this is not allowed. You would need to provide an object file (or static library) with a definition of that extern variable and link it statically with both the executable and the DLL, resulting in two distinct global variables (one belonging to the executable and one belonging to the DLL).
To actually export a global variable in Windows, you have to use a syntax similar to the function export/import syntax, i.e.:
#ifdef COMPILING_THE_DLL
#define MY_DLL_EXPORT extern "C" __declspec(dllexport)
#else
#define MY_DLL_EXPORT extern "C" __declspec(dllimport)
#endif
MY_DLL_EXPORT int my_global;
When you do that, the global variable is added to the list of exported symbols and can be linked like all the other functions.
In the case of Unix-like environments (like Linux), the dynamic libraries, called "shared objects" with extension .so export all extern global variables (or functions). In this case, if you do load-time linking from anywhere to a shared object file, then the global variables are shared, i.e., linked together as one. Basically, Unix-like systems are designed to make it so that there is virtually no difference between linking with a static or a dynamic library. Again, ODR applies across the board: an extern global variable will be shared across modules, meaning that it should have only one definition across all the modules loaded.
Finally, in both cases, for Windows or Unix-like systems, you can do run-time linking of the dynamic library, i.e., using either LoadLibrary() / GetProcAddress() / FreeLibrary() or dlopen() / dlsym() / dlclose(). In that case, you have to manually get a pointer to each of the symbols you wish to use, and that includes the global variables you wish to use. For global variables, you can use GetProcAddress() or dlsym() just the same as you do for functions, provided that the global variables are part of the exported symbol list (by the rules of the previous paragraphs).
And of course, as a necessary final note: global variables should be avoided. And I believe that the text you quoted (about things being "unclear") is referring exactly to the platform-specific differences that I just explained (dynamic libraries are not really defined by the C++ standard, this is platform-specific territory, meaning it is much less reliable / portable).
The answer left by Mikael Persson, although very thorough, contains a severe error (or at least misleading), in regards to the global variables, that needs to be cleared up. The original question asked if there were seperate copies of the global variables or if global variables were shared between the processes.
The true answer is the following: There are seperate (multiple) copies of the global variables for each process, and they are not shared between processes. Thus by stating the One Definition Rule (ODR) applies is also very misleading, it does not apply in the sense they are NOT the same globals used by each process, so in reality it is not "One Definition" between processes.
Also even though global variables are not "visible" to the process,..they are always easily "accesible" to the process, because any function could easily return a value of a global variable to the process, or for that matter, a process could set a value of a global variable through a function call. Thus this answer is also misleading.
In reality, "yes" the processes do have full "access" to the globals, at the very least through the funtion calls to the library. But to reiterate, each process has it's own copy of the globals, so it won't be the same globals that another process is using.
Thus the entire answer relating to external exporting of globals really is off topic, and unnecessary and not even related to the original question. Because the globals do not need extern to be accessed, the globals can always be accessed indirectly through function calls to the library.
The only part that is shared between the processes, of course, is the actual "code". The code only loaded in one place in physical memory (RAM), but that same physical memory location of course is mapped into the "local" virtual memory locations of each process.
To the contrary, a static library has a copy of the code for each process already baked into the executable (ELF, PE, etc.), and of course, like dynamic libraries has seperate globals for each process.
In unix systems:
It is to be noted , that the linker does not complain if two dynamic libraries export same global variables. but during execution a segfault might arise depending on access violations. A usual number exhibiting this behavior would be segmentation fault 15
segfault at xxxxxx ip xxxxxx sp xxxxxxx error 15 in a.out
Background
First of all, I think this question goes beyond the C++ standard. The standard deals with multiple translation units (instantiation units) and thus multiple object modules, but does not seem to acknowledge the possibility of having multiple independently compiled and linked binary modules (i.e., .so files on Linux and .dll files on Windows). After all, the latter more of less enters into the world of application binary interface (ABI) that the standard leaves to implementations to consider at present.
When only a single binary module is involved, the following code snippet illustrates an elegant and portable (standard-compliant) solution to make singletons.
inline T& get() {
static T var{};
return var;
}
There are two things to note about this solution. First, the inline specifier makes the function a candidate to be included in multiple translation units, which is very convenient. Note that, the standard guarantees there is only a single instance of get() and the local static variable var in the final binary module (see here).
The second thing to note is that since C++11, initialization of static local variables is properly synchronized (see the Static local variables section here). So, concurrent invocations of get() is fine.
Now, I try to extend this solution to the case when multiple binary modules are involved. I find the following variant works with VC++ on Windows.
// dllexport is used in building the library module, and
// dllimport is used in using the library in an application module.
// Usually controlled by a macro switch.
__declspec(dllexport/dllimport) inline T& get() {
static T var{};
return var;
}
Note for non-Windows users: __declspec(dllexport) specifies that an entity (i.e., a function, a class or an object) is implemented (defined) in this module and is to be referenced by other modules. __declspec(dllimport), on the other hand, specifies that an entity is not implemented in this module and is to be found in some other module.
Since VC++ supports exporting and importing template instantiations (see here), the above solution can even be templated. For example:
template <typename T> inline
T& get() {
static T var{};
return var;
}
// EXTERN is defined to be empty in building the library module, and
// to `extern` in using the library module in an application module.
// Again, this is usually controlled by a macro switch.
EXTERN template __declspec(dllexport/dllimport) int& get<int>();
As a side note, the inline specifier is not mandatory here. See this S.O. question.
The Question
Since there is no __declspec(dllexport/import) equivalents in GCC and clang, is there a way to make a variant of the above solution that works on these two compilers?
Also, in Boost.Log, I noticed the BOOST_LOG_INLINE_GLOBAL_LOGGER_DEFAULT macro (see the Global logger objects section here). It is claimed to create singletons even if the application consists of multiple modules. If someone knows about the inner workings of this macro, explanations are welcome here.
Finally, if you know about any better solutions for making singletons, feel free to post it as an answer.
Since there is no __declspec(dllexport/import) equivalents in GCC and clang, is there a way to make a variant of the above solution that works on these two compilers?
First, this is not as much a compiler-related question but rather an underlying operating system one. GCC (and supposedly clang) do support __declspec(dllexport/import) on Windows and basically do the same as what MSVC does with the functions and object marked this way. Basically, the marked symbol is placed in a table of exported symbols from the dll (an export table). This table can be used, for instance, when you query for a symbol in a dll in run time (see GetProcAddress).
Along with the dll there comes an associated lib file that contains auxiliary data for linking your application with the dll. When you link your application with the library, the linker uses the lib file to resolve references to the dll symbols and compose the import table in your application binary. When the application starts, the OS (or rather the runtime loader component of the OS) uses the import table to find out what dlls your application depends on and what symbols it imports from those dlls. It then uses export tables in the dlls to resolve addresses of the referenced symbols in the dlls and complete the linking process.
The important side effect of this process is that only imported symbols are dynamically resolved and every symbol you dynamically link to is associated with a particular dll. You can have same-named symbols in multiple dlls and the application itself, and these symbols will refer to distinct entities as long as they are not exported. If they are exported, linking process will fail because of ambiguity. This makes process-wide singletons difficult on Windows. This also breaks some C/C++ language rules, because taking address of an object or function with external linkage (in the language terms) can produce different addresses in different parts of the program. On the other hand, the dlls are more self-contained and depend on the loading context in a lesser degree.
Things are significantly different on Linux and other POSIX-like OSs. When linked, for each shared object (which can be an so library or the application executable) a table of symbols is compiled. It lists both the symbols this shared object implements and the symbols it is missing. Additionally, the linker may embed into the shared object a list of other shared objects (optionally, with search paths) that could be used to resolve the missing symbols. The runtime loader includes a linker which loads the shared objects sequentially and constructs a global table of symbols comprising symbols from all shared objects. As that table is constructed, the duplicate symbols from multiple shared objects are resolved to a single implementation (as all implementations are considered equivalent, the first shared object in the load list that implements the symbol is used). Any missing symbols are also resolved as the subsequent shared objects in the link order are loaded.
The effect of this process is that each symbol with external linkage resolve to a single implementation in one of the shared objects, even if multiple shared objects implement it. This is more in line with the C/C++ language rules and makes it simpler to implement process-wide singletons. A simple function-local static variable, not marked in any special way, is enough.
Now, there are ways to influence the linking process, and in particular there are ways to limit the symbols that are exported from a shared object. The most common ways to do that are using symbol visibility and linker scripts. With these tools it is possible to achieve linking behavior very close to Windows, with all its pros and cons. Note that when you limit symbol visibility you do have to mark the symbols you intend to export from the shared object with the visibility attribute or pragma. There's no need to mark symbols for import though.
Also, in Boost.Log, I noticed the BOOST_LOG_INLINE_GLOBAL_LOGGER_DEFAULT macro (see the Global logger objects section here). It is claimed to create singletons even if the application consists of multiple modules. If someone knows about the inner workings of this macro, explanations are welcome here.
Boost.Log requires to be built as a shared library when it is used from a multi-module application. This makes it possible for it to have a process-wide storage of references to global loggers declared throughout the application (the storage is implemented within the Boost.Log dll/so). When you obtain a logger declared with the BOOST_LOG_INLINE_GLOBAL_LOGGER_DEFAULT or similar macro, the storage is first looked up for the reference to the logger. If it is not found, the logger is created and a reference to it is stored back to the internal storage. Otherwise the existing reference is used. Along with reference caching, this provides performance very close to a function-local static variable.
Finally, if you know about any better solutions for making singletons, feel free to post it as an answer.
Although this is not really an answer, you should generally avoid singletons. They are difficult to implement correctly and in a way that does not hamper performance. If you really do have to implement one then the solution similar to Boost.Log looks generic enough. Note however that with this solution it is generally not known which module created (and as such, 'owns') the singleton, so you can't unload any modules dynamically. There may be simpler ways that are case-specific, like exporting a function returning a reference to the local static object. If you want portability and support non-default symbol visibility by default, always explicitly export your symbols.
I'm trying to understand what happens when modules with globals and static variables are dynamically linked to an application.
By modules, I mean each project in a solution (I work a lot with visual studio!). These modules are either built into *.lib or *.dll or the *.exe itself.
I understand that the binary of an application contains global and static data of all the individual translation units (object files) in the data segment (and read only data segment if const).
What happens when this application uses a module A with load-time dynamic linking? I assume the DLL has a section for its globals and statics. Does the operating system load them? If so, where do they get loaded to?
And what happens when the application uses a module B with run-time dynamic linking?
If I have two modules in my application that both use A and B, are copies of A and B's globals created as mentioned below (if they are different processes)?
Do DLLs A and B get access to the applications globals?
(Please state your reasons as well)
Quoting from MSDN:
Variables that are declared as global in a DLL source code file are treated as global variables by the compiler and linker, but each process that loads a given DLL gets its own copy of that DLL's global variables. The scope of static variables is limited to the block in which the static variables are declared. As a result, each process has its own instance of the DLL global and static variables by default.
and from here:
When dynamically linking modules, it can be unclear whether different libraries have their own instances of globals or whether the globals are shared.
Thanks.
This is a pretty famous difference between Windows and Unix-like systems.
No matter what:
Each process has its own address space, meaning that there is never any memory being shared between processes (unless you use some inter-process communication library or extensions).
The One Definition Rule (ODR) still applies, meaning that you can only have one definition of the global variable visible at link-time (static or dynamic linking).
So, the key issue here is really visibility.
In all cases, static global variables (or functions) are never visible from outside a module (dll/so or executable). The C++ standard requires that these have internal linkage, meaning that they are not visible outside the translation unit (which becomes an object file) in which they are defined. So, that settles that issue.
Where it gets complicated is when you have extern global variables. Here, Windows and Unix-like systems are completely different.
In the case of Windows (.exe and .dll), the extern global variables are not part of the exported symbols. In other words, different modules are in no way aware of global variables defined in other modules. This means that you will get linker errors if you try, for example, to create an executable that is supposed to use an extern variable defined in a DLL, because this is not allowed. You would need to provide an object file (or static library) with a definition of that extern variable and link it statically with both the executable and the DLL, resulting in two distinct global variables (one belonging to the executable and one belonging to the DLL).
To actually export a global variable in Windows, you have to use a syntax similar to the function export/import syntax, i.e.:
#ifdef COMPILING_THE_DLL
#define MY_DLL_EXPORT extern "C" __declspec(dllexport)
#else
#define MY_DLL_EXPORT extern "C" __declspec(dllimport)
#endif
MY_DLL_EXPORT int my_global;
When you do that, the global variable is added to the list of exported symbols and can be linked like all the other functions.
In the case of Unix-like environments (like Linux), the dynamic libraries, called "shared objects" with extension .so export all extern global variables (or functions). In this case, if you do load-time linking from anywhere to a shared object file, then the global variables are shared, i.e., linked together as one. Basically, Unix-like systems are designed to make it so that there is virtually no difference between linking with a static or a dynamic library. Again, ODR applies across the board: an extern global variable will be shared across modules, meaning that it should have only one definition across all the modules loaded.
Finally, in both cases, for Windows or Unix-like systems, you can do run-time linking of the dynamic library, i.e., using either LoadLibrary() / GetProcAddress() / FreeLibrary() or dlopen() / dlsym() / dlclose(). In that case, you have to manually get a pointer to each of the symbols you wish to use, and that includes the global variables you wish to use. For global variables, you can use GetProcAddress() or dlsym() just the same as you do for functions, provided that the global variables are part of the exported symbol list (by the rules of the previous paragraphs).
And of course, as a necessary final note: global variables should be avoided. And I believe that the text you quoted (about things being "unclear") is referring exactly to the platform-specific differences that I just explained (dynamic libraries are not really defined by the C++ standard, this is platform-specific territory, meaning it is much less reliable / portable).
The answer left by Mikael Persson, although very thorough, contains a severe error (or at least misleading), in regards to the global variables, that needs to be cleared up. The original question asked if there were seperate copies of the global variables or if global variables were shared between the processes.
The true answer is the following: There are seperate (multiple) copies of the global variables for each process, and they are not shared between processes. Thus by stating the One Definition Rule (ODR) applies is also very misleading, it does not apply in the sense they are NOT the same globals used by each process, so in reality it is not "One Definition" between processes.
Also even though global variables are not "visible" to the process,..they are always easily "accesible" to the process, because any function could easily return a value of a global variable to the process, or for that matter, a process could set a value of a global variable through a function call. Thus this answer is also misleading.
In reality, "yes" the processes do have full "access" to the globals, at the very least through the funtion calls to the library. But to reiterate, each process has it's own copy of the globals, so it won't be the same globals that another process is using.
Thus the entire answer relating to external exporting of globals really is off topic, and unnecessary and not even related to the original question. Because the globals do not need extern to be accessed, the globals can always be accessed indirectly through function calls to the library.
The only part that is shared between the processes, of course, is the actual "code". The code only loaded in one place in physical memory (RAM), but that same physical memory location of course is mapped into the "local" virtual memory locations of each process.
To the contrary, a static library has a copy of the code for each process already baked into the executable (ELF, PE, etc.), and of course, like dynamic libraries has seperate globals for each process.
In unix systems:
It is to be noted , that the linker does not complain if two dynamic libraries export same global variables. but during execution a segfault might arise depending on access violations. A usual number exhibiting this behavior would be segmentation fault 15
segfault at xxxxxx ip xxxxxx sp xxxxxxx error 15 in a.out
* Question revised (see below) *
I have a cpp file that defines a static global variable e.g.
static Foo bar;
This cpp file is compiled into an executable and a shared library. The executable may load the shared library at run time.
If I am on Linux there seem to be two copies of this variable. I assume one comes from the executable and one from the shared library. Other platforms (HP, Windows) there seems to be only one copy.
What controls this behavior on Linux and can I change it? For example is there a compiler or linker flag that will force the version of this variable from the shared library to be the same as the one from the executable?
* Revision of question *
Thanks for the answers so far. On re-examining the issue it is not actually the problem stated above. The static global variable above does indeed have multiple copies on Windows, so no difference to what I see on Linux.
However, I have another global variable (not static this time) which is declared in a cpp file and as extern in a header file.
On Windows this variable has multiple copies, one in the executable and one in each dll loaded up, and on Linux it only has one. So the question is now about this difference. How can I make Linux have multiple copies?
(The logic of my program meant the value of the static global variable was dependent of the value of the non-static global variable and I started accusing the wrong variable as being the problem)
I strongly suggest you read the following. Afterwards, you will understand everything about shared libraries in Linux. As said by others, the quick answer is that the static keyword will limit the scope of the global variable to the translation unit (and thus, to the executable or shared library). Using the keyword extern in the header, and compiling a cpp containing the same global variable in only one of the modules (exe or dll/so) will make the global variable unique and shared amongst all the modules.
EDIT:
The behaviour on Windows is not the same as on Linux when you use the extern pattern because Windows' method to load dynamic link libraries (dlls) is not the same and is basically incapable of linking global variables dynamically (such that only one exists). If you can use a static loading of the DLL (not using LoadLibrary), then you can use the following:
//In one module which has the actual global variable:
__declspec(dllexport) myClass myGlobalObject;
//In all other modules:
__declspec(dllimport) myClass myGlobalObject;
This will make the myGlobalObject unique and shared amongst all modules that are using the DLL in which the first version of the above is used.
If you want each module to have its own instance of the global variable, then use the static keyword, the behaviour will be the same for Linux or Windows.
If you want one unique instance of the global variable AND require dynamic loading (LoadLibrary or dlopen), you will have to make an initialization function to provide every loaded DLL with a pointer to the global variable (before it is used). You will also have to keep a reference count (you can use a shared_ptr for that) such that you can create a new one when none exist, increment the count otherwise, and be able to delete it when the count goes to zero as DLLs are being unloaded.
The static qualifier applied to a namespace variable means that the scope of the variable is the translation unit. That means that if that variable is defined in a header and you include it from multiple .cpp files you will get a copy for each. If you want a single copy, then mark it as extern (and not static) in the header, define it in a single translation unit and link that in either the executable or the library (but not both).
What compiler did you use on each of these platforms? The behavior you're describing for Linux would be what I'd expect, the static global is only local to that particular file at compile time.
You may be able to work around your issue using the GCC visibility attribute or the visibility pragma
I don't know about HPUX, but on Windows, if you have an exe and a DLL, and they each declare global variables, then there will be two distinct variables. If you are only getting a single variable then one image must be importing the variable from the other.
I'm trying to understand what happens when modules with globals and static variables are dynamically linked to an application.
By modules, I mean each project in a solution (I work a lot with visual studio!). These modules are either built into *.lib or *.dll or the *.exe itself.
I understand that the binary of an application contains global and static data of all the individual translation units (object files) in the data segment (and read only data segment if const).
What happens when this application uses a module A with load-time dynamic linking? I assume the DLL has a section for its globals and statics. Does the operating system load them? If so, where do they get loaded to?
And what happens when the application uses a module B with run-time dynamic linking?
If I have two modules in my application that both use A and B, are copies of A and B's globals created as mentioned below (if they are different processes)?
Do DLLs A and B get access to the applications globals?
(Please state your reasons as well)
Quoting from MSDN:
Variables that are declared as global in a DLL source code file are treated as global variables by the compiler and linker, but each process that loads a given DLL gets its own copy of that DLL's global variables. The scope of static variables is limited to the block in which the static variables are declared. As a result, each process has its own instance of the DLL global and static variables by default.
and from here:
When dynamically linking modules, it can be unclear whether different libraries have their own instances of globals or whether the globals are shared.
Thanks.
This is a pretty famous difference between Windows and Unix-like systems.
No matter what:
Each process has its own address space, meaning that there is never any memory being shared between processes (unless you use some inter-process communication library or extensions).
The One Definition Rule (ODR) still applies, meaning that you can only have one definition of the global variable visible at link-time (static or dynamic linking).
So, the key issue here is really visibility.
In all cases, static global variables (or functions) are never visible from outside a module (dll/so or executable). The C++ standard requires that these have internal linkage, meaning that they are not visible outside the translation unit (which becomes an object file) in which they are defined. So, that settles that issue.
Where it gets complicated is when you have extern global variables. Here, Windows and Unix-like systems are completely different.
In the case of Windows (.exe and .dll), the extern global variables are not part of the exported symbols. In other words, different modules are in no way aware of global variables defined in other modules. This means that you will get linker errors if you try, for example, to create an executable that is supposed to use an extern variable defined in a DLL, because this is not allowed. You would need to provide an object file (or static library) with a definition of that extern variable and link it statically with both the executable and the DLL, resulting in two distinct global variables (one belonging to the executable and one belonging to the DLL).
To actually export a global variable in Windows, you have to use a syntax similar to the function export/import syntax, i.e.:
#ifdef COMPILING_THE_DLL
#define MY_DLL_EXPORT extern "C" __declspec(dllexport)
#else
#define MY_DLL_EXPORT extern "C" __declspec(dllimport)
#endif
MY_DLL_EXPORT int my_global;
When you do that, the global variable is added to the list of exported symbols and can be linked like all the other functions.
In the case of Unix-like environments (like Linux), the dynamic libraries, called "shared objects" with extension .so export all extern global variables (or functions). In this case, if you do load-time linking from anywhere to a shared object file, then the global variables are shared, i.e., linked together as one. Basically, Unix-like systems are designed to make it so that there is virtually no difference between linking with a static or a dynamic library. Again, ODR applies across the board: an extern global variable will be shared across modules, meaning that it should have only one definition across all the modules loaded.
Finally, in both cases, for Windows or Unix-like systems, you can do run-time linking of the dynamic library, i.e., using either LoadLibrary() / GetProcAddress() / FreeLibrary() or dlopen() / dlsym() / dlclose(). In that case, you have to manually get a pointer to each of the symbols you wish to use, and that includes the global variables you wish to use. For global variables, you can use GetProcAddress() or dlsym() just the same as you do for functions, provided that the global variables are part of the exported symbol list (by the rules of the previous paragraphs).
And of course, as a necessary final note: global variables should be avoided. And I believe that the text you quoted (about things being "unclear") is referring exactly to the platform-specific differences that I just explained (dynamic libraries are not really defined by the C++ standard, this is platform-specific territory, meaning it is much less reliable / portable).
The answer left by Mikael Persson, although very thorough, contains a severe error (or at least misleading), in regards to the global variables, that needs to be cleared up. The original question asked if there were seperate copies of the global variables or if global variables were shared between the processes.
The true answer is the following: There are seperate (multiple) copies of the global variables for each process, and they are not shared between processes. Thus by stating the One Definition Rule (ODR) applies is also very misleading, it does not apply in the sense they are NOT the same globals used by each process, so in reality it is not "One Definition" between processes.
Also even though global variables are not "visible" to the process,..they are always easily "accesible" to the process, because any function could easily return a value of a global variable to the process, or for that matter, a process could set a value of a global variable through a function call. Thus this answer is also misleading.
In reality, "yes" the processes do have full "access" to the globals, at the very least through the funtion calls to the library. But to reiterate, each process has it's own copy of the globals, so it won't be the same globals that another process is using.
Thus the entire answer relating to external exporting of globals really is off topic, and unnecessary and not even related to the original question. Because the globals do not need extern to be accessed, the globals can always be accessed indirectly through function calls to the library.
The only part that is shared between the processes, of course, is the actual "code". The code only loaded in one place in physical memory (RAM), but that same physical memory location of course is mapped into the "local" virtual memory locations of each process.
To the contrary, a static library has a copy of the code for each process already baked into the executable (ELF, PE, etc.), and of course, like dynamic libraries has seperate globals for each process.
In unix systems:
It is to be noted , that the linker does not complain if two dynamic libraries export same global variables. but during execution a segfault might arise depending on access violations. A usual number exhibiting this behavior would be segmentation fault 15
segfault at xxxxxx ip xxxxxx sp xxxxxxx error 15 in a.out