I have an application that consists of a host program and a few plugins. The plugins are implemented in DLLs that can be dynamically loaded and unloaded. The plugin code (inside the DLLs) allocates memory for objects and passes the pointers to the host program. These objects are allocated on the DLL's heap, and there is no way to change the interface to use a specialized memory allocation function.
Is there a way for the host program to detect which heap manager an object was allocated by? I want to implement some kind of reference counting for the DLL: as long as the host program still uses memory from the DLL, the DLL must not be unloaded. That means I would like the host program to track which plugin allocated each memory block (the objects are kept in various lists inside the host). The current interface includes a call to unload a DLL. This call should only schedule the unload and execute it once the host has finished using the memory.
Thanks for any suggestions.
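To make the "schedule the unload, execute it later" idea concrete, here is a minimal single-threaded sketch. It does not solve the heap-detection part: it assumes the host already knows which plugin handed over each object at the moment it receives the pointer. `PluginLifetime` and its member names are made up for illustration, and the `unload` callback stands in for whatever really releases the DLL.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical host-side bookkeeping for one plugin (names are illustrative).
class PluginLifetime {
public:
    // `unload` is whatever actually releases the DLL, e.g. a FreeLibrary wrapper.
    explicit PluginLifetime(std::function<void()> unload)
        : unload_(std::move(unload)) {}

    // Call when the host stores an object that this plugin allocated.
    void acquire() { ++live_objects_; }

    // Call when the host has finished with such an object.
    void release() {
        assert(live_objects_ > 0);
        --live_objects_;
        maybe_unload();
    }

    // The existing "unload" entry point only schedules the unload.
    void request_unload() {
        unload_requested_ = true;
        maybe_unload();
    }

    bool unloaded() const { return unloaded_; }

private:
    // Runs the real unload once it was requested and nothing is in use.
    void maybe_unload() {
        if (unload_requested_ && live_objects_ == 0 && !unloaded_) {
            unloaded_ = true;
            if (unload_) unload_();
        }
    }

    std::function<void()> unload_;
    std::size_t live_objects_ = 0;
    bool unload_requested_ = false;
    bool unloaded_ = false;
};
```

The host would call `acquire()` when it puts a plugin object into one of its lists and `release()` when it removes it. A multithreaded host would need a mutex or atomics around the counters.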
If I enumerate heaps in my process using GetProcessHeaps API, is there a way to tell which module(s) were those heaps created by?
Here's why I need this: For the purpose of my security application I need to lock virtual memory used by my process (i.e. memory used by the Windows common controls, anything allocated via the new operator, COM, etc.)
The reason I need to know which module created the heap is to eliminate any DLLs that can be loaded into my process that have nothing to do with it. Say, as an example, TeamViewer loads into running processes to add whatever-they-need-it-for, so I don't want to lock its private heap, if it has one, etc.
If you are only concerned with your own allocations, then you can just use your own private heap and override the default new and delete handlers to use your heap.
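A rough sketch of what that could look like, under a few assumptions: the `private_heap` namespace and the `live_blocks` counter are invented for this example, and the non-Windows branch (plain `malloc`/`free`) exists only so the snippet builds elsewhere. On Windows it uses `HeapCreate`, `HeapAlloc` and `HeapFree`.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <new>
#ifdef _WIN32
#include <windows.h>
#endif

namespace private_heap {  // hypothetical helper namespace

// Number of blocks currently allocated through the replaced operators.
std::atomic<std::size_t> live_blocks{0};

#ifdef _WIN32
// One growable private heap for the whole process, created on first use.
// Error handling for a failed HeapCreate is omitted in this sketch.
inline HANDLE handle() {
    static HANDLE h = HeapCreate(0, 0, 0);
    return h;
}
inline void* allocate(std::size_t n) { return HeapAlloc(handle(), 0, n); }
inline void release(void* p) { HeapFree(handle(), 0, p); }
#else
// Stand-in so the sketch compiles off Windows; not a private heap.
inline void* allocate(std::size_t n) { return std::malloc(n); }
inline void release(void* p) { std::free(p); }
#endif

}  // namespace private_heap

// Replace the global allocation functions so that ordinary `new`/`delete`
// in this module go through the private heap.
void* operator new(std::size_t n) {
    void* p = private_heap::allocate(n ? n : 1);
    if (!p) throw std::bad_alloc();
    ++private_heap::live_blocks;
    return p;
}

void operator delete(void* p) noexcept {
    if (!p) return;
    --private_heap::live_blocks;
    private_heap::release(p);
}

void operator delete(void* p, std::size_t) noexcept { operator delete(p); }
```

Note that this only covers allocations made through `new` in modules that link against these replaced operators. Memory allocated by other DLLs, or through other mechanisms, still comes from wherever those modules get it.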
As per Microsoft (see first point in this), a DLL can have only one instance of itself running in a system at one time. But from what I have read elsewhere online, including here on SO, processes can load multiple instances of the same DLL: read-only data in the DLL may be shared using memory-mapping techniques, but each process has its own copy of the DLL's writable data stored in its own memory space.
Also, according to the second point at the same link, a DLL can't have its own stack, memory handles, global memory, etc. But from what I understand, since there can be multiple functions exported from and/or inside a DLL, these must have their own stacks, file handles, etc. And why can't a global variable defined in a DLL be considered as using global memory?
I'm working in C++.
Talking about a DLL on its own won't make much sense. To get a better understanding, think of DLLs in the context of being loaded into a process.
The documentation is correct. Threads that run code or exported functions within a DLL will each have their own stack. Memory handles, global memory, etc. belong to processes, not to individual threads.
If you have a global variable defined in a DLL, it's global in the context of the process that the DLL is mapped into.
If a DLL is mapped into multiple processes, then each process gets its own copy of the global variable.
It's part of maintaining process isolation/integrity (each process has its own memory area, handle tables, etc.).
HTH
I was reading that all a process's memory is released by the OS when the process terminates (by any means) so negating the need to call every dtor in turn.
Now my question is how does the memory of a DLL or SO relate to clean up of alloc'd memory?
I ask because I will probably end up using Java and/or C# to call into a C++ DLL with some static C-style functions which will allocate the C++ objects on the heap. Sorry if I got carried away with the heap vs stack thread; I feel I have lost sight of the concept of _the_ heap (i.e. only one).
Any other potential pitfalls for memory leaks when using libraries?
The library becomes part of the process when it is loaded. Regarding tidy up of memory, handles, resources etc., the system doesn't distinguish whether they were created in the executable image or the library.
There is nothing for you to worry about. The operating system's loader takes care of this.
In general, shared libraries will be made visible to your process's address space via memory mapping (all done by the loader), and the OS keeps track of how many processes still need a given shared library. State data that is needed separately per process is typically handled by copy-on-write, so there's no danger that your crypto library might accidentally be using another process's key :-) In short, don't worry.
Edit. Perhaps you're wondering what happens if your library function calls malloc() and doesn't clean up. Well, the library's code becomes part of your process, so it is really your process that requests the memory, and so when your process terminates, the OS cleans up as usual.
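For the scenario in the question (C-style functions that allocate C++ objects on behalf of a Java or C# caller), a common pattern is to export a matching destroy function for every create function, so leaks during the process's lifetime are under the caller's control. This is only a sketch: `Counter`, `CounterHandle` and the three function names are invented here, and on Windows the functions would additionally need to be exported (for instance with `__declspec(dllexport)`).

```cpp
#include <new>

// Hypothetical C++ class that lives inside the library.
class Counter {
public:
    int increment() { return ++value_; }

private:
    int value_ = 0;
};

extern "C" {

// Opaque handle: callers only ever see a pointer to an incomplete type.
struct CounterHandle;

// Allocates the object on the heap used by this library's runtime.
// Returns a null handle if the allocation fails.
CounterHandle* counter_create() {
    return reinterpret_cast<CounterHandle*>(new (std::nothrow) Counter());
}

// Returns the new count, or -1 for a null handle.
int counter_increment(CounterHandle* h) {
    return h ? reinterpret_cast<Counter*>(h)->increment() : -1;
}

// Frees the object with the same runtime that allocated it.
// Passing a null handle is allowed and does nothing.
void counter_destroy(CounterHandle* h) {
    delete reinterpret_cast<Counter*>(h);
}

}  // extern "C"
```

If the caller forgets `counter_destroy`, the object leaks for as long as the process runs. The OS still reclaims it at process exit, as described above.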
I have a program (not mine, have no source code) which exposes an interface so I can write a DLL which will be called by my program. Now I wondered when I declare some variable in this DLL I make, in what memory space is this going to be stored?
I mean, it's just gonna sit in the memory space of the EXE's address space, right? How is the DLL loaded in regards to the EXE though? I thought a DLL was only ever loaded in memory once, so how does that work in relation to me creating local variables in my DLL? (like objects, classes etc)
A DLL is loaded once per process. Once upon a time DLLs were shared between processes, but that hasn't been the case since Windows 3.1 went the way of the dodo.
Any global variables that you declare in your DLL will be stored in a data page. A different page from the EXE's global variables, mind.
Now, if you allocate memory on the heap, whether or not your allocations are mixed in with the EXE's depends on which heap you use. If both EXE and DLL use the same runtime linked as a DLL, then they will both get memory from the same heap. If they have different runtimes, or link against the runtime statically, they'll get different heaps. This becomes a very big can of worms, so I shan't go any further here.
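One way to stay out of that can of worms is to make sure memory is always freed by the module that allocated it. A minimal sketch, assuming the DLL exports both an allocating function and a matching free function: `plugin_make_greeting` and `plugin_free` are hypothetical names, and they are defined locally here only so the example is self-contained.

```cpp
#include <cstdlib>
#include <cstring>
#include <memory>

// Stand-ins for two functions a plugin DLL might export (hypothetical names).
// In a real build they would live in the DLL and use the DLL's own runtime.
extern "C" char* plugin_make_greeting() {
    const char text[] = "hello from plugin";
    char* p = static_cast<char*>(std::malloc(sizeof text));
    if (p) std::memcpy(p, text, sizeof text);
    return p;
}

extern "C" void plugin_free(char* p) { std::free(p); }

// Host side: a deleter that routes the release back into the plugin, so the
// host never calls its own free/delete on plugin memory.
struct PluginFree {
    void operator()(char* p) const noexcept { plugin_free(p); }
};

using PluginString = std::unique_ptr<char, PluginFree>;

// Wraps the raw pointer immediately so it cannot be freed the wrong way.
PluginString get_greeting() {
    return PluginString(plugin_make_greeting());
}
```

With this arrangement it no longer matters whether the EXE and the DLL share a heap, because each block goes back to the allocator it came from.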
Your DLL will declare a DllMain, which is the equivalent of the entry point in a regular executable. When your DLL is loaded, your DllMain gets called. Here is a link to the best practices of what should be done in there.
Usually you will do some sort of initialisation there. When your DLL is loaded, it is loaded into the virtual memory space of the executable that called LoadLibrary. LoadLibrary handles all the mapping and relocations that need to be dealt with. From this point on, all memory you allocate or modify through your DLL is in the same virtual memory space as the process it's mapped into.
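As a rough illustration, a DllMain that does a small amount of per-process initialisation might look like the sketch below. `g_plugin_ready` and `plugin_is_ready` are made-up names, and the `#else` branch only provides stand-in definitions so the snippet compiles on non-Windows toolchains; a real DLL would simply include `<windows.h>`.

```cpp
#ifdef _WIN32
#include <windows.h>
#else
// Stand-ins so the sketch also compiles off Windows (illustration only).
typedef int BOOL;
typedef unsigned long DWORD;
typedef void* HINSTANCE;
typedef void* LPVOID;
#define WINAPI
#define TRUE 1
#define DLL_PROCESS_DETACH 0
#define DLL_PROCESS_ATTACH 1
#define DLL_THREAD_ATTACH 2
#define DLL_THREAD_DETACH 3
#endif

// Hypothetical piece of per-process plugin state. It lives in the DLL's own
// data pages, inside the address space of whichever process loaded the DLL.
static bool g_plugin_ready = false;

bool plugin_is_ready() { return g_plugin_ready; }

extern "C" BOOL WINAPI DllMain(HINSTANCE /*module*/, DWORD reason,
                               LPVOID /*reserved*/) {
    switch (reason) {
    case DLL_PROCESS_ATTACH:
        // Keep this cheap: no LoadLibrary calls, no waiting on other threads.
        g_plugin_ready = true;
        break;
    case DLL_PROCESS_DETACH:
        g_plugin_ready = false;
        break;
    case DLL_THREAD_ATTACH:
    case DLL_THREAD_DETACH:
        break;  // nothing per-thread in this sketch
    }
    return TRUE;  // returning FALSE on process attach makes the load fail
}
```

Anything heavier than setting simple state is usually better placed in an explicit exported init function that the host calls after loading.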
Presumably the executable interfaces by loading your DLL then calling some sort of exported function in it. Basically everything that you do once your DLL is loaded will be within the memory space of the process it is loaded into.
If you want to know more about exactly what goes on when your DLL is loaded you should look into the semantics of LoadLibrary().
I have a few doubts regarding how Windows manages a .dll's memory.
When .dll's are loaded into the host process, how is the memory managed?
Does the .dll get access to the entire memory available to the host process or just a portion of it? I.e., is there a limitation when memory is allocated by a function inside the .dll?
Will STL classes like string, vector (dynamically increasing storage), etc. used by the dll work without issue here?
"Memory management" is a split responsibility, typically. The OS hands address space in big chunks to the runtime, which then hands it out in smaller bits to the program. This address space may or may not have RAM allocated. (If not, there will be swap space to back it)
Basically, when a DLL is loaded, Windows allocates address space for the code and data segments, and calls DllMain(). The C++ compiler will have arranged to call global ctors from DllMain(). If it's a DLL written in C++, it will likely depend on a C++ runtime DLL, which in turn will depend on Kernel32.DLL and User32.DLL. Windows understands such dependencies and will arrange for them to be loaded in the correct order.
There is only one address space for a process, so a DLL will get access to all memory of the process. If a DLL is loaded in two processes, there will be two logical copies of the code and the data. (Copies of the code and read-only data might share the same physical RAM, though.)
If the DLL allocates memory using OS functions, Windows will allocate the memory to the process from which the DLL made that allocation. The process must return the memory, but any code in the process may do so. If your DLL allocates memory using C++ functions, it will do so by calling operator new in the C++ runtime DLL. That memory must be returned by calling operator delete in the (same) C++ runtime DLL. Again, it doesn't matter who does that.
STL classes like vector<> can be multiply instantiated, but it doesn't matter as long as you're using the same compiler. All instantiations will be substantially equal, and all will return the vector's memory to the same deallocation function.
There are 2 main assumptions in this explanation:
The EXE and its DLLs are all compiled with the same compiler
The EXE and its DLLs all link against the C++ runtime DLL (i.e. not statically linked)
Static linking against the C++ runtime is useful if you want to ship a single, self-contained EXE. But if you're already shipping DLLs, you should keep the C++ runtime in its own DLL too.
Does .dll get access to the entire memory available to the host process or just a portion of it? i.e is there a limitation when memory is allocated by a function inside the .dll?
After a DLL has been loaded into the host process, there is no distinction whatsoever for code "living" in the DLL vs. code "living" in the original executable module. For the process being executed all memory ranges are the same, whether they come from a DLL or from the original executable.
There are no differences as to what the code from the DLL can do vs. what the code compiled in the original exec module can do.
That said, there are differences when using the heap - these are explained in the questions Space_C0wb0y provided the links for in the comments
Will STL classes like string, vector (dynamically increasing storage) etc used by the dll, work without issue here?
They will create issues (solvable ones, but still) if you use them in the interface of your DLL. They will not (or should only under very rare circumstances) create issues if you do not use them at the DLL interface level. I am sure there are a few more specific questions and answers around for this.
Basically, if you use them at the interface level, the DLL and the EXE have to be compiled with "exactly" the same flags, i.e. the types need to be binary compatible. If the compiler flags (optimization, etc.) in your DLL differ from the ones in the EXE such that a std::string is laid out differently in memory in the EXE vs. the DLL, then passing a string object between the two will result in a crash or silent errors (or demons flying out of your nose).
If you only use the STL types inside of functions or between functions internal to your DLL, then their compatibility with the EXE doesn't matter.
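A sketch of what "STL inside, plain types at the interface" can look like: the DLL builds its result with `std::string` internally, but only a caller-supplied `char` buffer crosses the boundary. `build_name` and `plugin_get_name` are hypothetical names chosen for this example.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Internal to the DLL: free to use std::string however it likes.
static std::string build_name() {
    return std::string("plugin-") + "v1";
}

// Hypothetical exported function. Only plain C types cross the boundary.
// Copies at most capacity-1 characters plus a terminator into `buffer` and
// returns the full length of the name, so the caller can detect truncation.
// Calling it with a null buffer just queries the required length.
extern "C" std::size_t plugin_get_name(char* buffer, std::size_t capacity) {
    const std::string name = build_name();
    if (buffer && capacity > 0) {
        const std::size_t n =
            name.size() < capacity - 1 ? name.size() : capacity - 1;
        std::memcpy(buffer, name.data(), n);
        buffer[n] = '\0';
    }
    return name.size();
}
```

Because the caller owns the buffer, no memory allocated on one side is ever freed on the other, and the layout of `std::string` in the DLL never has to match the one in the EXE.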