Related
I'm trying to understand what happens when modules with globals and static variables are dynamically linked to an application.
By modules, I mean each project in a solution (I work a lot with visual studio!). These modules are either built into *.lib or *.dll or the *.exe itself.
I understand that the binary of an application contains global and static data of all the individual translation units (object files) in the data segment (and read only data segment if const).
What happens when this application uses a module A with load-time dynamic linking? I assume the DLL has a section for its globals and statics. Does the operating system load them? If so, where do they get loaded to?
And what happens when the application uses a module B with run-time dynamic linking?
If I have two modules in my application that both use A and B, are copies of A and B's globals created as mentioned below (if they are different processes)?
Do DLLs A and B get access to the applications globals?
(Please state your reasons as well)
Quoting from MSDN:
Variables that are declared as global in a DLL source code file are treated as global variables by the compiler and linker, but each process that loads a given DLL gets its own copy of that DLL's global variables. The scope of static variables is limited to the block in which the static variables are declared. As a result, each process has its own instance of the DLL global and static variables by default.
and from here:
When dynamically linking modules, it can be unclear whether different libraries have their own instances of globals or whether the globals are shared.
Thanks.
This is a pretty famous difference between Windows and Unix-like systems.
No matter what:
Each process has its own address space, meaning that there is never any memory being shared between processes (unless you use some inter-process communication library or extensions).
The One Definition Rule (ODR) still applies, meaning that you can only have one definition of the global variable visible at link-time (static or dynamic linking).
So, the key issue here is really visibility.
In all cases, static global variables (or functions) are never visible from outside a module (dll/so or executable). The C++ standard requires that these have internal linkage, meaning that they are not visible outside the translation unit (which becomes an object file) in which they are defined. So, that settles that issue.
Where it gets complicated is when you have extern global variables. Here, Windows and Unix-like systems are completely different.
In the case of Windows (.exe and .dll), the extern global variables are not part of the exported symbols. In other words, different modules are in no way aware of global variables defined in other modules. This means that you will get linker errors if you try, for example, to create an executable that is supposed to use an extern variable defined in a DLL, because this is not allowed. You would need to provide an object file (or static library) with a definition of that extern variable and link it statically with both the executable and the DLL, resulting in two distinct global variables (one belonging to the executable and one belonging to the DLL).
To actually export a global variable in Windows, you have to use a syntax similar to the function export/import syntax, i.e.:
#ifdef COMPILING_THE_DLL
#define MY_DLL_EXPORT extern "C" __declspec(dllexport)
#else
#define MY_DLL_EXPORT extern "C" __declspec(dllimport)
#endif
MY_DLL_EXPORT int my_global;
When you do that, the global variable is added to the list of exported symbols and can be linked like all the other functions.
In the case of Unix-like environments (like Linux), the dynamic libraries, called "shared objects" with extension .so export all extern global variables (or functions). In this case, if you do load-time linking from anywhere to a shared object file, then the global variables are shared, i.e., linked together as one. Basically, Unix-like systems are designed to make it so that there is virtually no difference between linking with a static or a dynamic library. Again, ODR applies across the board: an extern global variable will be shared across modules, meaning that it should have only one definition across all the modules loaded.
Finally, in both cases, for Windows or Unix-like systems, you can do run-time linking of the dynamic library, i.e., using either LoadLibrary() / GetProcAddress() / FreeLibrary() or dlopen() / dlsym() / dlclose(). In that case, you have to manually get a pointer to each of the symbols you wish to use, and that includes the global variables you wish to use. For global variables, you can use GetProcAddress() or dlsym() just the same as you do for functions, provided that the global variables are part of the exported symbol list (by the rules of the previous paragraphs).
And of course, as a necessary final note: global variables should be avoided. And I believe that the text you quoted (about things being "unclear") is referring exactly to the platform-specific differences that I just explained (dynamic libraries are not really defined by the C++ standard, this is platform-specific territory, meaning it is much less reliable / portable).
The answer left by Mikael Persson, although very thorough, contains a severe error (or at least misleading), in regards to the global variables, that needs to be cleared up. The original question asked if there were seperate copies of the global variables or if global variables were shared between the processes.
The true answer is the following: There are seperate (multiple) copies of the global variables for each process, and they are not shared between processes. Thus by stating the One Definition Rule (ODR) applies is also very misleading, it does not apply in the sense they are NOT the same globals used by each process, so in reality it is not "One Definition" between processes.
Also even though global variables are not "visible" to the process,..they are always easily "accesible" to the process, because any function could easily return a value of a global variable to the process, or for that matter, a process could set a value of a global variable through a function call. Thus this answer is also misleading.
In reality, "yes" the processes do have full "access" to the globals, at the very least through the funtion calls to the library. But to reiterate, each process has it's own copy of the globals, so it won't be the same globals that another process is using.
Thus the entire answer relating to external exporting of globals really is off topic, and unnecessary and not even related to the original question. Because the globals do not need extern to be accessed, the globals can always be accessed indirectly through function calls to the library.
The only part that is shared between the processes, of course, is the actual "code". The code only loaded in one place in physical memory (RAM), but that same physical memory location of course is mapped into the "local" virtual memory locations of each process.
To the contrary, a static library has a copy of the code for each process already baked into the executable (ELF, PE, etc.), and of course, like dynamic libraries has seperate globals for each process.
In unix systems:
It is to be noted , that the linker does not complain if two dynamic libraries export same global variables. but during execution a segfault might arise depending on access violations. A usual number exhibiting this behavior would be segmentation fault 15
segfault at xxxxxx ip xxxxxx sp xxxxxxx error 15 in a.out
I'm trying to understand what happens when modules with globals and static variables are dynamically linked to an application.
By modules, I mean each project in a solution (I work a lot with visual studio!). These modules are either built into *.lib or *.dll or the *.exe itself.
I understand that the binary of an application contains global and static data of all the individual translation units (object files) in the data segment (and read only data segment if const).
What happens when this application uses a module A with load-time dynamic linking? I assume the DLL has a section for its globals and statics. Does the operating system load them? If so, where do they get loaded to?
And what happens when the application uses a module B with run-time dynamic linking?
If I have two modules in my application that both use A and B, are copies of A and B's globals created as mentioned below (if they are different processes)?
Do DLLs A and B get access to the applications globals?
(Please state your reasons as well)
Quoting from MSDN:
Variables that are declared as global in a DLL source code file are treated as global variables by the compiler and linker, but each process that loads a given DLL gets its own copy of that DLL's global variables. The scope of static variables is limited to the block in which the static variables are declared. As a result, each process has its own instance of the DLL global and static variables by default.
and from here:
When dynamically linking modules, it can be unclear whether different libraries have their own instances of globals or whether the globals are shared.
Thanks.
This is a pretty famous difference between Windows and Unix-like systems.
No matter what:
Each process has its own address space, meaning that there is never any memory being shared between processes (unless you use some inter-process communication library or extensions).
The One Definition Rule (ODR) still applies, meaning that you can only have one definition of the global variable visible at link-time (static or dynamic linking).
So, the key issue here is really visibility.
In all cases, static global variables (or functions) are never visible from outside a module (dll/so or executable). The C++ standard requires that these have internal linkage, meaning that they are not visible outside the translation unit (which becomes an object file) in which they are defined. So, that settles that issue.
Where it gets complicated is when you have extern global variables. Here, Windows and Unix-like systems are completely different.
In the case of Windows (.exe and .dll), the extern global variables are not part of the exported symbols. In other words, different modules are in no way aware of global variables defined in other modules. This means that you will get linker errors if you try, for example, to create an executable that is supposed to use an extern variable defined in a DLL, because this is not allowed. You would need to provide an object file (or static library) with a definition of that extern variable and link it statically with both the executable and the DLL, resulting in two distinct global variables (one belonging to the executable and one belonging to the DLL).
To actually export a global variable in Windows, you have to use a syntax similar to the function export/import syntax, i.e.:
#ifdef COMPILING_THE_DLL
#define MY_DLL_EXPORT extern "C" __declspec(dllexport)
#else
#define MY_DLL_EXPORT extern "C" __declspec(dllimport)
#endif
MY_DLL_EXPORT int my_global;
When you do that, the global variable is added to the list of exported symbols and can be linked like all the other functions.
In the case of Unix-like environments (like Linux), the dynamic libraries, called "shared objects" with extension .so export all extern global variables (or functions). In this case, if you do load-time linking from anywhere to a shared object file, then the global variables are shared, i.e., linked together as one. Basically, Unix-like systems are designed to make it so that there is virtually no difference between linking with a static or a dynamic library. Again, ODR applies across the board: an extern global variable will be shared across modules, meaning that it should have only one definition across all the modules loaded.
Finally, in both cases, for Windows or Unix-like systems, you can do run-time linking of the dynamic library, i.e., using either LoadLibrary() / GetProcAddress() / FreeLibrary() or dlopen() / dlsym() / dlclose(). In that case, you have to manually get a pointer to each of the symbols you wish to use, and that includes the global variables you wish to use. For global variables, you can use GetProcAddress() or dlsym() just the same as you do for functions, provided that the global variables are part of the exported symbol list (by the rules of the previous paragraphs).
And of course, as a necessary final note: global variables should be avoided. And I believe that the text you quoted (about things being "unclear") is referring exactly to the platform-specific differences that I just explained (dynamic libraries are not really defined by the C++ standard, this is platform-specific territory, meaning it is much less reliable / portable).
The answer left by Mikael Persson, although very thorough, contains a severe error (or at least misleading), in regards to the global variables, that needs to be cleared up. The original question asked if there were seperate copies of the global variables or if global variables were shared between the processes.
The true answer is the following: There are seperate (multiple) copies of the global variables for each process, and they are not shared between processes. Thus by stating the One Definition Rule (ODR) applies is also very misleading, it does not apply in the sense they are NOT the same globals used by each process, so in reality it is not "One Definition" between processes.
Also even though global variables are not "visible" to the process,..they are always easily "accesible" to the process, because any function could easily return a value of a global variable to the process, or for that matter, a process could set a value of a global variable through a function call. Thus this answer is also misleading.
In reality, "yes" the processes do have full "access" to the globals, at the very least through the funtion calls to the library. But to reiterate, each process has it's own copy of the globals, so it won't be the same globals that another process is using.
Thus the entire answer relating to external exporting of globals really is off topic, and unnecessary and not even related to the original question. Because the globals do not need extern to be accessed, the globals can always be accessed indirectly through function calls to the library.
The only part that is shared between the processes, of course, is the actual "code". The code only loaded in one place in physical memory (RAM), but that same physical memory location of course is mapped into the "local" virtual memory locations of each process.
To the contrary, a static library has a copy of the code for each process already baked into the executable (ELF, PE, etc.), and of course, like dynamic libraries has seperate globals for each process.
In unix systems:
It is to be noted , that the linker does not complain if two dynamic libraries export same global variables. but during execution a segfault might arise depending on access violations. A usual number exhibiting this behavior would be segmentation fault 15
segfault at xxxxxx ip xxxxxx sp xxxxxxx error 15 in a.out
I'm trying to understand what happens when modules with globals and static variables are dynamically linked to an application.
By modules, I mean each project in a solution (I work a lot with visual studio!). These modules are either built into *.lib or *.dll or the *.exe itself.
I understand that the binary of an application contains global and static data of all the individual translation units (object files) in the data segment (and read only data segment if const).
What happens when this application uses a module A with load-time dynamic linking? I assume the DLL has a section for its globals and statics. Does the operating system load them? If so, where do they get loaded to?
And what happens when the application uses a module B with run-time dynamic linking?
If I have two modules in my application that both use A and B, are copies of A and B's globals created as mentioned below (if they are different processes)?
Do DLLs A and B get access to the applications globals?
(Please state your reasons as well)
Quoting from MSDN:
Variables that are declared as global in a DLL source code file are treated as global variables by the compiler and linker, but each process that loads a given DLL gets its own copy of that DLL's global variables. The scope of static variables is limited to the block in which the static variables are declared. As a result, each process has its own instance of the DLL global and static variables by default.
and from here:
When dynamically linking modules, it can be unclear whether different libraries have their own instances of globals or whether the globals are shared.
Thanks.
This is a pretty famous difference between Windows and Unix-like systems.
No matter what:
Each process has its own address space, meaning that there is never any memory being shared between processes (unless you use some inter-process communication library or extensions).
The One Definition Rule (ODR) still applies, meaning that you can only have one definition of the global variable visible at link-time (static or dynamic linking).
So, the key issue here is really visibility.
In all cases, static global variables (or functions) are never visible from outside a module (dll/so or executable). The C++ standard requires that these have internal linkage, meaning that they are not visible outside the translation unit (which becomes an object file) in which they are defined. So, that settles that issue.
Where it gets complicated is when you have extern global variables. Here, Windows and Unix-like systems are completely different.
In the case of Windows (.exe and .dll), the extern global variables are not part of the exported symbols. In other words, different modules are in no way aware of global variables defined in other modules. This means that you will get linker errors if you try, for example, to create an executable that is supposed to use an extern variable defined in a DLL, because this is not allowed. You would need to provide an object file (or static library) with a definition of that extern variable and link it statically with both the executable and the DLL, resulting in two distinct global variables (one belonging to the executable and one belonging to the DLL).
To actually export a global variable in Windows, you have to use a syntax similar to the function export/import syntax, i.e.:
#ifdef COMPILING_THE_DLL
#define MY_DLL_EXPORT extern "C" __declspec(dllexport)
#else
#define MY_DLL_EXPORT extern "C" __declspec(dllimport)
#endif
MY_DLL_EXPORT int my_global;
When you do that, the global variable is added to the list of exported symbols and can be linked like all the other functions.
In the case of Unix-like environments (like Linux), the dynamic libraries, called "shared objects" with extension .so export all extern global variables (or functions). In this case, if you do load-time linking from anywhere to a shared object file, then the global variables are shared, i.e., linked together as one. Basically, Unix-like systems are designed to make it so that there is virtually no difference between linking with a static or a dynamic library. Again, ODR applies across the board: an extern global variable will be shared across modules, meaning that it should have only one definition across all the modules loaded.
Finally, in both cases, for Windows or Unix-like systems, you can do run-time linking of the dynamic library, i.e., using either LoadLibrary() / GetProcAddress() / FreeLibrary() or dlopen() / dlsym() / dlclose(). In that case, you have to manually get a pointer to each of the symbols you wish to use, and that includes the global variables you wish to use. For global variables, you can use GetProcAddress() or dlsym() just the same as you do for functions, provided that the global variables are part of the exported symbol list (by the rules of the previous paragraphs).
And of course, as a necessary final note: global variables should be avoided. And I believe that the text you quoted (about things being "unclear") is referring exactly to the platform-specific differences that I just explained (dynamic libraries are not really defined by the C++ standard, this is platform-specific territory, meaning it is much less reliable / portable).
The answer left by Mikael Persson, although very thorough, contains a severe error (or at least misleading), in regards to the global variables, that needs to be cleared up. The original question asked if there were seperate copies of the global variables or if global variables were shared between the processes.
The true answer is the following: There are seperate (multiple) copies of the global variables for each process, and they are not shared between processes. Thus by stating the One Definition Rule (ODR) applies is also very misleading, it does not apply in the sense they are NOT the same globals used by each process, so in reality it is not "One Definition" between processes.
Also even though global variables are not "visible" to the process,..they are always easily "accesible" to the process, because any function could easily return a value of a global variable to the process, or for that matter, a process could set a value of a global variable through a function call. Thus this answer is also misleading.
In reality, "yes" the processes do have full "access" to the globals, at the very least through the funtion calls to the library. But to reiterate, each process has it's own copy of the globals, so it won't be the same globals that another process is using.
Thus the entire answer relating to external exporting of globals really is off topic, and unnecessary and not even related to the original question. Because the globals do not need extern to be accessed, the globals can always be accessed indirectly through function calls to the library.
The only part that is shared between the processes, of course, is the actual "code". The code only loaded in one place in physical memory (RAM), but that same physical memory location of course is mapped into the "local" virtual memory locations of each process.
To the contrary, a static library has a copy of the code for each process already baked into the executable (ELF, PE, etc.), and of course, like dynamic libraries has seperate globals for each process.
In unix systems:
It is to be noted , that the linker does not complain if two dynamic libraries export same global variables. but during execution a segfault might arise depending on access violations. A usual number exhibiting this behavior would be segmentation fault 15
segfault at xxxxxx ip xxxxxx sp xxxxxxx error 15 in a.out
* Question revised (see below) *
I have a cpp file that defines a static global variable e.g.
static Foo bar;
This cpp file is compiled into an executable and a shared library. The executable may load the shared library at run time.
If I am on Linux there seem to be two copies of this variable. I assume one comes from the executable and one from the shared library. Other platforms (HP, Windows) there seems to be only one copy.
What controls this behavior on Linux and can I change it? For example is there a compiler or linker flag that will force the version of this variable from the shared library to be the same as the one from the executable?
* Revision of question *
Thanks for the answers so far. On re-examining the issue it is not actually the problem stated above. The static global variable above does indeed have multiple copies on Windows, so no difference to what I see on Linux.
However, I have another global variable (not static this time) which is declared in a cpp file and as extern in a header file.
On Windows this variable has multiple copies, one in the executable and one in each dll loaded up, and on Linux it only has one. So the question is now about this difference. How can I make Linux have multiple copies?
(The logic of my program meant the value of the static global variable was dependent of the value of the non-static global variable and I started accusing the wrong variable as being the problem)
I strongly suggest you read the following. Afterwards, you will understand everything about shared libraries in Linux. As said by others, the quick answer is that the static keyword will limit the scope of the global variable to the translation unit (and thus, to the executable or shared library). Using the keyword extern in the header, and compiling a cpp containing the same global variable in only one of the modules (exe or dll/so) will make the global variable unique and shared amongst all the modules.
EDIT:
The behaviour on Windows is not the same as on Linux when you use the extern pattern because Windows' method to load dynamic link libraries (dlls) is not the same and is basically incapable of linking global variables dynamically (such that only one exists). If you can use a static loading of the DLL (not using LoadLibrary), then you can use the following:
//In one module which has the actual global variable:
__declspec(dllexport) myClass myGlobalObject;
//In all other modules:
__declspec(dllimport) myClass myGlobalObject;
This will make the myGlobalObject unique and shared amongst all modules that are using the DLL in which the first version of the above is used.
If you want each module to have its own instance of the global variable, then use the static keyword, the behaviour will be the same for Linux or Windows.
If you want one unique instance of the global variable AND require dynamic loading (LoadLibrary or dlopen), you will have to make an initialization function to provide every loaded DLL with a pointer to the global variable (before it is used). You will also have to keep a reference count (you can use a shared_ptr for that) such that you can create a new one when none exist, increment the count otherwise, and be able to delete it when the count goes to zero as DLLs are being unloaded.
The static qualifier applied to a namespace variable means that the scope of the variable is the translation unit. That means that if that variable is defined in a header and you include it from multiple .cpp files you will get a copy for each. If you want a single copy, then mark it as extern (and not static) in the header, define it in a single translation unit and link that in either the executable or the library (but not both).
What compiler did you use on each of these platforms? The behavior you're describing for Linux would be what I'd expect, the static global is only local to that particular file at compile time.
You may be able to work around your issue using the GCC visibility attribute or the visibility pragma
I don't know about HPUX, but on Windows, if you have an exe and a DLL, and they each declare global variables, then there will be two distinct variables. If you are only getting a single variable then one image must be importing the variable from the other.
I'm trying to understand what happens when modules with globals and static variables are dynamically linked to an application.
By modules, I mean each project in a solution (I work a lot with visual studio!). These modules are either built into *.lib or *.dll or the *.exe itself.
I understand that the binary of an application contains global and static data of all the individual translation units (object files) in the data segment (and read only data segment if const).
What happens when this application uses a module A with load-time dynamic linking? I assume the DLL has a section for its globals and statics. Does the operating system load them? If so, where do they get loaded to?
And what happens when the application uses a module B with run-time dynamic linking?
If I have two modules in my application that both use A and B, are copies of A and B's globals created as mentioned below (if they are different processes)?
Do DLLs A and B get access to the applications globals?
(Please state your reasons as well)
Quoting from MSDN:
Variables that are declared as global in a DLL source code file are treated as global variables by the compiler and linker, but each process that loads a given DLL gets its own copy of that DLL's global variables. The scope of static variables is limited to the block in which the static variables are declared. As a result, each process has its own instance of the DLL global and static variables by default.
and from here:
When dynamically linking modules, it can be unclear whether different libraries have their own instances of globals or whether the globals are shared.
Thanks.
This is a pretty famous difference between Windows and Unix-like systems.
No matter what:
Each process has its own address space, meaning that there is never any memory being shared between processes (unless you use some inter-process communication library or extensions).
The One Definition Rule (ODR) still applies, meaning that you can only have one definition of the global variable visible at link-time (static or dynamic linking).
So, the key issue here is really visibility.
In all cases, static global variables (or functions) are never visible from outside a module (dll/so or executable). The C++ standard requires that these have internal linkage, meaning that they are not visible outside the translation unit (which becomes an object file) in which they are defined. So, that settles that issue.
Where it gets complicated is when you have extern global variables. Here, Windows and Unix-like systems are completely different.
In the case of Windows (.exe and .dll), the extern global variables are not part of the exported symbols. In other words, different modules are in no way aware of global variables defined in other modules. This means that you will get linker errors if you try, for example, to create an executable that is supposed to use an extern variable defined in a DLL, because this is not allowed. You would need to provide an object file (or static library) with a definition of that extern variable and link it statically with both the executable and the DLL, resulting in two distinct global variables (one belonging to the executable and one belonging to the DLL).
To actually export a global variable in Windows, you have to use a syntax similar to the function export/import syntax, i.e.:
#ifdef COMPILING_THE_DLL
#define MY_DLL_EXPORT extern "C" __declspec(dllexport)
#else
#define MY_DLL_EXPORT extern "C" __declspec(dllimport)
#endif
MY_DLL_EXPORT int my_global;
When you do that, the global variable is added to the list of exported symbols and can be linked like all the other functions.
In the case of Unix-like environments (like Linux), the dynamic libraries, called "shared objects" with extension .so export all extern global variables (or functions). In this case, if you do load-time linking from anywhere to a shared object file, then the global variables are shared, i.e., linked together as one. Basically, Unix-like systems are designed to make it so that there is virtually no difference between linking with a static or a dynamic library. Again, ODR applies across the board: an extern global variable will be shared across modules, meaning that it should have only one definition across all the modules loaded.
Finally, in both cases, for Windows or Unix-like systems, you can do run-time linking of the dynamic library, i.e., using either LoadLibrary() / GetProcAddress() / FreeLibrary() or dlopen() / dlsym() / dlclose(). In that case, you have to manually get a pointer to each of the symbols you wish to use, and that includes the global variables you wish to use. For global variables, you can use GetProcAddress() or dlsym() just the same as you do for functions, provided that the global variables are part of the exported symbol list (by the rules of the previous paragraphs).
And of course, as a necessary final note: global variables should be avoided. And I believe that the text you quoted (about things being "unclear") is referring exactly to the platform-specific differences that I just explained (dynamic libraries are not really defined by the C++ standard, this is platform-specific territory, meaning it is much less reliable / portable).
The answer left by Mikael Persson, although very thorough, contains a severe error (or at least misleading), in regards to the global variables, that needs to be cleared up. The original question asked if there were seperate copies of the global variables or if global variables were shared between the processes.
The true answer is the following: There are seperate (multiple) copies of the global variables for each process, and they are not shared between processes. Thus by stating the One Definition Rule (ODR) applies is also very misleading, it does not apply in the sense they are NOT the same globals used by each process, so in reality it is not "One Definition" between processes.
Also even though global variables are not "visible" to the process,..they are always easily "accesible" to the process, because any function could easily return a value of a global variable to the process, or for that matter, a process could set a value of a global variable through a function call. Thus this answer is also misleading.
In reality, "yes" the processes do have full "access" to the globals, at the very least through the funtion calls to the library. But to reiterate, each process has it's own copy of the globals, so it won't be the same globals that another process is using.
Thus the entire answer relating to external exporting of globals really is off topic, and unnecessary and not even related to the original question. Because the globals do not need extern to be accessed, the globals can always be accessed indirectly through function calls to the library.
The only part that is shared between the processes, of course, is the actual "code". The code only loaded in one place in physical memory (RAM), but that same physical memory location of course is mapped into the "local" virtual memory locations of each process.
To the contrary, a static library has a copy of the code for each process already baked into the executable (ELF, PE, etc.), and of course, like dynamic libraries has seperate globals for each process.
In unix systems:
It is to be noted , that the linker does not complain if two dynamic libraries export same global variables. but during execution a segfault might arise depending on access violations. A usual number exhibiting this behavior would be segmentation fault 15
segfault at xxxxxx ip xxxxxx sp xxxxxxx error 15 in a.out