Dynamic linking, memory usage and concurrency - c++

When an executable links against a static library, the executable contains only the parts of the library that the code actually uses, right?
But I'm missing how exactly shared objects (dynamically linked libraries) are used.
As far as I know, they are not included in the executable; they are loaded dynamically using dlopen, and this is done directly by the linker, right?
In this case, where is the library located in memory? There are posts here explaining that dynamic libraries can reduce memory usage, but how exactly? And if a dynamic library is somehow loaded into shared memory (for several processes), how does the kernel handle concurrency in this case?
I realize this is probably something fundamental, and sorry if this is a duplicate; I couldn't find one.
I am aware of Static linking vs dynamic linking, and what I ask is a bit different.

The shared library is indeed loaded into memory that is shared between all "users" (all applications using the same library).
This is essentially done by reference counting: each new user of the library increments the count, and when an application exits the count is decremented. If it reaches zero, the library is no longer needed and will be removed from memory (quite possibly only when memory is needed for something else, rather than immediately). The reference counting is done atomically by the kernel, so there is no concurrency conflict.
Note that it's only the CODE in the shared library that is actually shared. Any data sections will be private per process.
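To see the code/data split described above on a Linux machine, one option is to look at /proc/self/maps, where each mapping of a library shows up with its permissions. The following is a minimal sketch, assuming Linux and the usual maps line format; summarize_maps and summarize_self are hypothetical helper names, not part of any library. Executable mappings (r-x) are the file-backed code pages that can be shared between processes, while writable ones (rw-) are private to the process.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Counts of mappings whose path contains a given substring.
struct MapSummary {
    int executable = 0;  // e.g. "r-xp": code pages backed by the library file
    int writable = 0;    // e.g. "rw-p": data pages, private to this process
};

// Parse text in the /proc/<pid>/maps format:
//   address-range perms offset dev inode pathname
MapSummary summarize_maps(std::istream& in, const std::string& needle) {
    MapSummary s;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string range, perms, offset, dev, inode, path;
        fields >> range >> perms >> offset >> dev >> inode >> path;
        // Skip malformed lines and mappings that do not belong to the library.
        if (perms.size() < 4 || path.find(needle) == std::string::npos) continue;
        if (perms[2] == 'x') ++s.executable;
        if (perms[1] == 'w') ++s.writable;
    }
    return s;
}

// Convenience wrapper for the current process (Linux only).
MapSummary summarize_self(const std::string& needle) {
    std::ifstream maps("/proc/self/maps");
    return summarize_maps(maps, needle);
}
```

Calling summarize_self("libc") from a dynamically linked program would typically report at least one executable and one writable mapping, though the exact numbers depend on the system.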

The dynamic library is loaded only once for all the processes that use it. The operating system then maps the library's memory into each process's address space, so the library consumes its required memory only once. With static linking, every executable includes its own copy of the statically linked code, and that copy is loaded along with the executable. This means a function that is included in 10 executables resides in memory 10 times.

Related

In dynamic linking, how does the .exe know where to search for the library when it is updated?

It is my understanding that when a C program uses dynamic linking, the compiled version of the program (.exe) stores the memory address of the library somewhere. How about when the program is installed on someone else's computer, isn't the location of the library different? Or, when you update the library, wouldn't its memory address be different?
Neither C nor C++ specifies how this works. It's different for the different operating systems and exe formats. To know the specifics you need to look into how your implementation does things.
The short answer to your question is that the OS sets the environment up within which your program runs. It has to attach the program to the right places, or at the least notify it. Generally you start your program and the format tells the OS what libraries it should load and then it links up the addresses in some way.
There's usually a way to do this manually as well and directly request a library to be loaded during runtime. The automatic linking of calls though may not happen in these cases.
Yes, the location of the library is different on different computers. And yes, when you update the library, its memory address is different. That is why the address of a dynamically linked function cannot be hardwired into the executable file. Instead, only its name and the name of the hosting library (without a path) are stored in the PE file.
Before program.exe starts, the OS loader looks for each required DLL, loads it into the virtual address space of the starting program, finds the current addresses of the required functions in that DLL and writes them to the Import Address Table (IAT).
When your program calls some dynamically linked function, it actually makes an indirect CALL of its address in IAT.
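The mechanism described above can be modelled in a few lines. This is a toy sketch only, not the real PE loader: ImportEntry, resolve_imports, call_import and the two "exported" functions are all hypothetical names made up for illustration. It shows the idea that the executable stores names, the loader fills in addresses, and calls go indirectly through the table.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

using ImportedFn = int (*)(int);

// Stand-ins for functions exported by a DLL. Their addresses are not known
// to the "executable" ahead of time.
int twice(int x) { return 2 * x; }
int negate_value(int x) { return -x; }

// What the executable stores: a name, plus an empty slot for the address.
struct ImportEntry {
    std::string name;
    ImportedFn address = nullptr;  // filled in by the loader
};

// Toy loader: look up each imported name in the library's export table
// and write the current address into the import table.
void resolve_imports(ImportEntry* table, std::size_t count,
                     const std::map<std::string, ImportedFn>& exports) {
    for (std::size_t i = 0; i < count; ++i) {
        auto it = exports.find(table[i].name);
        if (it == exports.end())
            throw std::runtime_error("unresolved import: " + table[i].name);
        table[i].address = it->second;
    }
}

// Calls made by the program go through the table slot (an indirect call).
int call_import(const ImportEntry& entry, int arg) {
    return entry.address(arg);
}
```

If the library is updated and its functions move, only the table contents change; the calling code stays the same because it never embeds the address itself.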

should I lock before dlopen?

I have an *.so library which obtains some information from system libraries using dlopen. The library can be used by multiple applications simultaneously.
Maybe it is a silly question, but should I flock the library before calling dlopen on it? I haven't found a direct answer anywhere.
Similar to what was said in the comments, you don't need a lock (flock or a semaphore) unless you are accessing a shared resource that could change on you (i.e. accessing shared memory and needing to keep that data consistent). The way dynamic loading ... dlopen() ... works:
Those two routines are actually simple wrappers that call back into
the dynamic linker. When the dynamic linker loads a library via
dlopen(), it does the same relocation and symbol resolution it does on
any other library, so the dynamically loaded program can without any
special arrangements call back to routines already loaded
Because of the way linking works, relocations and modifications to the GOT/PLT are done in the memory space of the process calling dlopen, not in the memory where the shared object's code is mapped.
If a hundred processes use a shared library, it makes no sense to have
100 copies of the code in memory taking up space. If the code is
completely read-only, and hence never, ever, modified
Since the shared objects sit in read-only memory, you never need to worry about them suddenly changing on you, so there is no need for a flock :)
Note: because you have a shared object linking to other shared objects, the GOT of the initial shared object needs to be updated with the relocations of the libraries being loaded with dlopen(), but that is stored in a r/w segment of process-private memory, not in that of the shared objects.
the shared library must still have a unique data instance in each
process...the read-write data section is always put at a known offset
from the code section of the library. This way, via the magic of
virtual-memory, every process sees its own data section but can share
the unmodified code
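For reference, a plain dlopen call without any file locking might look like the sketch below. It assumes a glibc-based Linux system where the math library is available as libm.so.6; cos_via_dlopen is a hypothetical helper name. Each process that runs this gets its own relocations in its own writable memory, which is why no cross-process lock is involved. On older glibc versions the program may need to be linked with -ldl.

```cpp
#include <cassert>
#include <dlfcn.h>
#include <optional>

// Load a shared library at run time, look up cos(), call it, and unload.
// Returns std::nullopt if the library or the symbol cannot be found.
// The default library name is an assumption about the target system.
std::optional<double> cos_via_dlopen(double x,
                                     const char* library = "libm.so.6") {
    void* handle = dlopen(library, RTLD_NOW | RTLD_LOCAL);
    if (!handle) return std::nullopt;

    dlerror();  // clear any stale error state before calling dlsym
    void* symbol = dlsym(handle, "cos");
    if (dlerror() != nullptr || symbol == nullptr) {
        dlclose(handle);
        return std::nullopt;
    }

    using CosFn = double (*)(double);
    CosFn fn = reinterpret_cast<CosFn>(symbol);
    double result = fn(x);

    dlclose(handle);  // drops this process's reference to the library
    return result;
}
```

Locking would only become relevant if the code went on to touch a resource that really is shared and mutable, which dlopen itself does not expose.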

Does the complete library get loaded into memory (RAM) during execution of a program?

How does it differ between static and dynamic libraries?
I understand how static and dynamic libraries are created and used, but I have a doubt about how a library is loaded into primary memory: will a static/dynamic library get fully loaded into RAM if we call only one function from it?
E.g. suppose we have a 10 MB library and call only one function from it: will the complete library be loaded, or only the object code of the called function? And is it the same for static and dynamic libraries? (With a static library the executable will be larger, but what about loading time?)
thanks in advance!
Linux (like all modern OSes with demand paging) will map your whole library on load, but only page in the pages it actually has to read, e.g. to initialize the library and to resolve all external (non-delayed) symbols.
Those tasks are mostly delegated to a user-mode dynamic loader.
Parts of your images that are never written, or that are re-merged afterwards by KSM (Kernel Samepage Merging), can be stored only once, relieving memory pressure.
When dynamic linking is needed, the kernel bootstraps the dynamic
linker (ELF interpreter), which initializes itself, and then loads the
specified shared objects (unless already loaded).
IBM: Anatomy of Linux dynamic libraries
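The difference between "mapped" and "actually in RAM" can be observed directly. The sketch below is Linux-specific and uses an anonymous mapping as a stand-in for a library image (an assumption made to keep the example independent of the page cache); Residency and map_and_touch_one_page are hypothetical names. It maps a region, touches a single page, and asks the kernel through mincore how many pages are resident.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>
#include <vector>

struct Residency {
    std::size_t total_pages = 0;     // pages covered by the mapping
    std::size_t resident_pages = 0;  // pages the kernel reports as in RAM
};

// Map `bytes` of anonymous memory, touch only the first page, and ask the
// kernel (via mincore) how many pages of the mapping are resident.
Residency map_and_touch_one_page(std::size_t bytes) {
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t pages = (bytes + page - 1) / page;
    Residency r;
    r.total_pages = pages;

    void* mem = mmap(nullptr, pages * page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return r;

    static_cast<volatile char*>(mem)[0] = 1;  // fault in one page

    std::vector<unsigned char> vec(pages);
    if (mincore(mem, pages * page, vec.data()) == 0) {
        // The lowest bit of each entry says whether that page is resident.
        for (unsigned char v : vec) r.resident_pages += (v & 1u);
    }
    munmap(mem, pages * page);
    return r;
}
```

For a 10 MB mapping one would normally expect only a small fraction of the pages to be resident after a single access, but the exact count depends on the kernel configuration (for example transparent huge pages), so no particular number should be relied on.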

Will using a shared library in place of a static library affect memory usage?

I am linking against 10 static libraries.
My binary's file size is reduced when I use dynamic libraries instead.
As far as I know, using dynamic libraries will not reduce memory usage.
But my senior told me that using shared libraries will also reduce memory usage (when multiple processes are running the same executable code).
Is that statement right?
He told me that, since there will be no duplicate copies of the functions used in the library, memory usage will be lower when you create n instances of that process.
When the process starts, it forks its 10 children. So will using dynamic libraries in place of static libraries reduce total memory usage?
In your example, dynamic libraries won't save you much. When you fork your process on a modern OS all the pages are marked copy on write rather than actually copied. So your static library is already shared between your 10 copies of your process.
However, where you can save is when the dynamic library is shared between different processes rather than forks of the same process. So if you're using the same glibc.so as another process, the two processes are sharing the physical pages of glibc.so, even though they are otherwise unrelated processes.
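The copy-on-write behaviour after fork can be illustrated from user space. The sketch below (POSIX; g_counter and child_write_stays_private are hypothetical names) does not measure physical page sharing, which is not visible this way; it only shows the observable consequence: a page is shared until one side writes to it, and the write then lands in a private copy.

```cpp
#include <cassert>
#include <sys/wait.h>
#include <unistd.h>

int g_counter = 1;  // lives in a writable data page

// Fork, let the child modify g_counter, and check what each side sees.
// Returns true if the child saw its own write and the parent's copy
// kept its original value.
bool child_write_stays_private() {
    g_counter = 1;
    pid_t pid = fork();
    if (pid < 0) return false;  // fork failed
    if (pid == 0) {
        g_counter = 99;  // the kernel gives the child a private copy here
        _exit(g_counter == 99 ? 0 : 1);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return false;
    bool child_ok = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    return child_ok && g_counter == 1;
}
```

Code pages are never written, so after a fork they stay shared regardless of whether they came from a static or a dynamic library.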
If you fork a given process there shouldn't be much of a difference, because most operating systems use copy-on-write. This means that pages are only copied if they are modified, so things like the code segments of shared libraries shouldn't be affected.
On the other hand different processes won't be able to share code if they're statically linked. Consider libc, which practically every binary links against... if they were all statically linked you'd end up with dozens of copies of printf in memory.
The bottom line is you shouldn't link your binaries statically unless you have an excellent reason for it.
Your senior in this instance is correct. A single copy of the shared library will be loaded into memory and will be used by every program that references it.
There is a post regarding this topic here:
http://www.linuxquestions.org/linux/articles/Technical/Understanding_memory_usage_on_Linux

How do I decide whether having more than one VC++ CRT state is a problem for my application?

This MSDN article says that if my application loads VC++ runtime multiple times because either it or some DLLs it depends on are statically linked against VC++ runtime then the application will have multiple CRT states and this can lead to undefined behaviour.
How exactly do I decide if this is a problem for me? For example, in this MSDN article several examples are provided that basically say that objects maintained by the C++ runtime, such as file handles, should not be passed across DLL boundaries. What exactly is the list of things to check if I want my project to statically link against the VC++ runtime?
It's OK to have multiple copies of the CRT as long as you aren't doing certain things...:
Each copy of the CRT will manage its own heap.
This can cause unexpected problems if you allocate a heap-based object in module A using 'new', then pass it to module B where you attempt to release it using 'delete'. The CRT for module B will attempt to identify the pointer in terms of its own heap, and that's where the undefined behavior comes in: if you're lucky, the CRT in module B will detect the problem and you'll get a heap corruption error. If you're unlucky, something weird will happen, and you won't notice it until much later...
You'll also have serious problems if you pass other CRT-managed things like file handles between modules.
If all of your modules dynamically link to the CRT, then they'll all share the same heap, and you can pass things around and not have to worry about where they get deleted.
If you statically link the CRT into each module, then you have to take care that your interfaces don't include these types, and that you don't allocate using 'new' or malloc in one module, and then attempt to clean up in another.
You can get around this limitation by using an OS allocation function like MS Windows's GlobalAlloc()/GlobalFree(), but this can be a pain since you have to keep track of which allocation scheme you used for each pointer.
Having to worry about whether the target machine has the CRT DLL your modules require, or packaging one to go along with your application can be a pain, but it's a one-time pain - and once done, you won't need to worry about all that other stuff above.
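When each module does carry its own CRT, a common way to follow the "allocate and free in the same module" rule is to export a create/destroy pair and let callers hold the object through a smart pointer whose deleter calls back into the owning module. The following is a portable sketch under that assumption; Widget, widget_create, widget_destroy and make_widget are hypothetical names, and in a real DLL the two functions would be exported (for example with __declspec(dllexport)) rather than living in one file.

```cpp
#include <cassert>
#include <memory>

// --- Would live inside the DLL (hypothetical module A) ---
struct Widget { int value; };
static int g_live_widgets = 0;  // only here so the example can be checked

Widget* widget_create(int value) {
    ++g_live_widgets;
    return new Widget{value};  // allocated with module A's CRT
}
void widget_destroy(Widget* w) {
    if (!w) return;
    --g_live_widgets;
    delete w;                  // released with module A's CRT as well
}
int widget_live_count() { return g_live_widgets; }

// --- Would live in the caller (hypothetical module B) ---
// The deleter routes destruction back into module A instead of calling
// delete in module B.
using WidgetPtr = std::unique_ptr<Widget, void (*)(Widget*)>;

WidgetPtr make_widget(int value) {
    return WidgetPtr(widget_create(value), &widget_destroy);
}
```

With this shape the caller never calls delete on memory it did not allocate, so it does not matter which CRT, or how many, the modules were built against.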
It is not that you can't pass CRT handles around. It is that you should be careful when you have two modules reading/writing the handle.
For example, in dll A, you write
char* p = new char[10];
and in the main program, write
delete[]p;
When DLL A and your main program each have their own CRT, the new and delete operations happen in different heaps. The CRT in the main program cannot manage memory that belongs to the DLL's heap, so the result is a memory leak or heap corruption.
The same goes for file handles. Internally, a FILE may be implemented differently in each CRT, in addition to the memory it owns. Calling fopen in one module and fwrite and fclose in another can cause problems, because the data that the FILE* points to can differ between CRT runtimes.
But you can certainly pass FILE* or memory pointer back and forth between two modules.
One issue I've seen with having different versions of the MSVC runtime loaded is that each runtime has its own heap, so you can end up with allocations failing with "out of memory" even though there is plenty of memory available, but in the wrong heap.
The major problem with this is that the problems can occur at runtime. At the binary level, you've got DLLs making function calls, eventually to the OS. The calls themselves are fine, but at runtime the arguments are not. Of course, this can depend on the exact code paths exercised.
Worse, it could even be time- or machine-dependent. An example of that would be a shared resource, used by two DLLs and protected by a proper reference count. The last user will delete it, which could be the DLL that created it (lucky) or the other (trouble)
As far as I know, there isn't a (safe) way to check which version of the CRT a library is statically linked with.
You can, however, check the binaries with a program such as DependencyWalker. If you don't see any DLLs beginning with MSVC in the dependency list, it's most likely statically linked.
If the library is from a third party, I'd just assume it's using a different version of the CRT.
This isn't 100% accurate, of course, because the library you're looking at could be linked with another library that links with the CRT.
if CRT handles are a problem, then don't expose CRT handles in your dll exports :-)
don't pass memory pointers around (if an exported class allocates some memory, it should free it... which means don't mess with inline constructor/destructor?)
This MSDN article says that you should be alerted to the problem by linker warning LNK4098. It also suggests that passing any CRT handles across a CRT boundary is likely to cause trouble, and mentions locale, low-level file I/O and memory allocation as examples.