We need multiple programs to call functions in a common library. The library functions access and update a common global memory, and each program’s function calls need to see that memory. That is, one function call needs to see the updates made by any prior function call, even one made from another program.
For compatibility reasons we have several design constraints on how the functions exposed by the shared library must operate:
Any data items (both standard data types and objects) that are declared globally must be visible to all callers regardless of the thread in which the code is running.
Any data items that are declared locally in a function are only visible inside that function.
Any standard data type or an instance of any class may appear either locally or globally or both.
One solution is to put the library’s common global memory in named shared memory. The first library call would create the named shared memory and initialize it. Subsequent program calls would get the address of the shared memory and use it as a pointer to the global data structure. Object instances declared globally would need to be dynamically allocated in shared memory while object instances declared locally could be placed on the stack or in the local heap of the caller thread. Problems arise because initialized objects in the global memory can create and point to sub-objects which allocate (new) additional memory. These new allocations also need to be in the shared memory and seen by all library callers. Another complication is these objects, which contain strings, files, etc., can also be used in the calling program. When declared in the calling program, the object’s memory is local to the calling program, not shared. So the object’s code needs to handle either case.
It appears to us that the solution will require that we override the global placement new, regular new and delete operators. We found a design for a memory management system that looks like it will work but we haven’t found any actual implementations. If anyone knows of an implementation of Nathan Myers’ memory management design (http://www.cantrip.org/wave12.html?seenIEPage=1) I would appreciate a link to it. Alternatively if anyone knows of another shared memory manager that accommodates dynamically allocating objects I would love to know about it as well. I've checked the Boost libraries and all the other sources I can find but nothing seems to do what we need. We prefer not to have to write one ourselves. Since performance and robustness are important it would be nice to use proven code. Thanks in advance for any ideas/help.
Thanks for the suggestions about the ATL and OSSP libraries. I am checking them out now, although I'm afraid ATL is too Windows-centric if our target turns out to be Unix.
One other thing now seems clear to us. Since objects can be dynamically created during execution, the memory management scheme must be able to allocate additional pages of shared memory. This is now starting to look like a full-blown heap replacement memory manager.
Take a look at boost.interprocess.
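To illustrate the "object must work both locally and in the shared region" requirement from the question, here is a minimal sketch that uses only the standard library, not Boost.Interprocess itself. The names `Buffer`, `BufferAllocator`, `Samples` and `in_buffer` are made up for this example, and the fixed byte array merely stands in for a mapped shared region. The idea is that a class parameterized on an allocator can place its sub-allocations in either kind of memory.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
#include <vector>

// Stand-in for a mapped shared region (assumption: a real one would come from the OS).
struct Buffer {
    alignas(std::max_align_t) unsigned char bytes[1024];
    std::size_t used = 0;
};

// Minimal bump allocator that hands out memory from a Buffer.
template <typename T>
struct BufferAllocator {
    using value_type = T;
    Buffer* buf;
    explicit BufferAllocator(Buffer* b) : buf(b) {}
    template <typename U>
    BufferAllocator(const BufferAllocator<U>& other) : buf(other.buf) {}
    T* allocate(std::size_t n) {
        std::size_t start = (buf->used + alignof(T) - 1) / alignof(T) * alignof(T);
        if (start + n * sizeof(T) > sizeof buf->bytes) throw std::bad_alloc();
        buf->used = start + n * sizeof(T);
        return reinterpret_cast<T*>(buf->bytes + start);
    }
    void deallocate(T*, std::size_t) {}  // bump allocator: individual frees are ignored
};
template <typename T, typename U>
bool operator==(const BufferAllocator<T>& a, const BufferAllocator<U>& b) { return a.buf == b.buf; }
template <typename T, typename U>
bool operator!=(const BufferAllocator<T>& a, const BufferAllocator<U>& b) { return a.buf != b.buf; }

// The same class works with the default heap or with the region, depending on Alloc.
template <typename Alloc = std::allocator<int>>
struct Samples {
    std::vector<int, Alloc> values;
    explicit Samples(const Alloc& a = Alloc()) : values(a) {}
};

// Reports whether p points inside the buffer.
bool in_buffer(const Buffer& b, const void* p) {
    auto addr = reinterpret_cast<std::uintptr_t>(p);
    auto begin = reinterpret_cast<std::uintptr_t>(b.bytes);
    return addr >= begin && addr < begin + sizeof b.bytes;
}
```

Note that this sketch still stores raw pointers inside the vector, so it would not survive the region being mapped at a different address in another process. That is the problem offset-based pointers are meant to address.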
OSSP mm - Shared Memory Allocation:
man 3 mm
As I'm sure you have found, this is a very complex problem, and very difficult to correctly implement. A few tips from my experiences. First of all, you'll definitely want to synchronize access to the shared memory allocations using semaphores. Secondly, any modifications to the shared objects by multiple processes need to be protected by semaphores as well. Finally, you need to think in terms of offsets from the start of the shared memory region, rather than absolute pointer values, when defining your objects and data structures (it's generally possible for the memory to be mapped at a different address in each attached process, although you can choose a fixed mapping address if you need to). Putting it all together in a robust manner is the hard part. It's easy for shared memory based data structures to become corrupted if a process should unexpectedly die, so some cleanup / recovery mechanism is usually required.
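The "offsets rather than absolute pointers" tip can be sketched roughly as follows. `RegionOffset`, `Node`, `sum_from` and `sum_after_remap` are hypothetical names for this illustration, and copying the bytes into a second buffer only simulates the region being mapped at a different address in another process. Synchronization is deliberately left out.

```cpp
#include <cstddef>
#include <cstring>
#include <new>

// Stores a byte offset from the start of the region instead of an address,
// so the link stays valid wherever the region happens to be mapped.
template <typename T>
struct RegionOffset {
    std::ptrdiff_t off = -1;  // -1 means "null"

    void set(const void* base, const T* p) {
        off = p ? reinterpret_cast<const char*>(p) - static_cast<const char*>(base) : -1;
    }
    T* get(void* base) const {
        return off < 0 ? nullptr : reinterpret_cast<T*>(static_cast<char*>(base) + off);
    }
};

struct Node {
    int value;
    RegionOffset<Node> next;
};

// Walks the list whose first node sits at the start of the region.
int sum_from(void* base) {
    int total = 0;
    for (Node* n = static_cast<Node*>(base); n != nullptr; n = n->next.get(base)) {
        total += n->value;
    }
    return total;
}

// Builds a two-node list in one buffer, copies the bytes to another buffer
// (a different "mapping address"), and walks the copy.
int sum_after_remap() {
    alignas(Node) unsigned char a[128] = {};
    alignas(Node) unsigned char b[128] = {};
    Node* first = new (a) Node{1, {}};
    Node* second = new (a + sizeof(Node)) Node{2, {}};
    first->next.set(a, second);
    std::memcpy(b, a, sizeof a);
    return sum_from(b);
}
```

Had `Node` held a plain `Node*`, the copy in `b` would still point back into `a`. Boost.Interprocess ships an `offset_ptr` class built on the same idea.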
Also study mutexes and semaphores. When two or more entities need to share memory or data, there needs to be a "traffic signal" mechanism to limit write access to only one user.
I'm currently working on a project requiring per thread storage for caching fetched data. I'm looking into boost::thread_specific_ptr for solution, but I'm still not very clear on the following aspects:
Where is the object that thread_specific_ptr points to actually allocated in the process address space? Is it in a special segment like bss, data or others? Will it be protected so that other threads in the same process cannot examine the address where the object is located? If it is a special memory section, will it be dangerous to use STL containers in boost::thread_specific_ptr, since a container could resize itself as more data is added until it crosses the section boundary?
Thanks in advance!
The thread-local pointers are platform dependent, but the object you store via it is just on the heap.
The only thing really thread local is a pointer to it, and the operating system/runtime library will have some thread-associated storage to keep them in. That's an implementation detail, and you don't have to worry about it.
If you plan on storing a lot of containers, consider storing one structure that contains (pointers to) them all.
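A rough sketch of that layout, using the C++11 `thread_local` keyword as a stand-in for `boost::thread_specific_ptr` (the two are not the same API, but the memory picture is similar). `ThreadCache`, `local_cache` and `fill_in_threads` are names invented for this example.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <thread>
#include <vector>

// One structure that holds all per-thread containers.
struct ThreadCache {
    std::map<int, std::string> fetched;
    std::vector<int> recent;
};

ThreadCache& local_cache() {
    // Only this pointer is per-thread; the ThreadCache it owns lives on the
    // ordinary heap, so its containers can grow like any other container.
    thread_local std::unique_ptr<ThreadCache> cache;
    if (!cache) cache.reset(new ThreadCache);
    return *cache;
}

// Each thread t inserts t + 1 entries into its own cache and reports the size it sees.
std::vector<std::size_t> fill_in_threads(int n_threads) {
    std::vector<std::size_t> sizes(n_threads);
    std::vector<std::thread> workers;
    for (int t = 0; t < n_threads; ++t) {
        workers.emplace_back([t, &sizes] {
            for (int i = 0; i <= t; ++i) local_cache().fetched[i] = "value";
            sizes[t] = local_cache().fetched.size();
        });
    }
    for (auto& w : workers) w.join();
    return sizes;
}
```

Because every thread gets its own `ThreadCache`, no thread sees entries inserted by another. Nothing prevents a thread from reading another thread's cache if it is handed the address, though.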
Also, use thread-local storage sparingly. Depending on them is a design smell in my book. You might simply require process isolation, instead of threads.
I am currently writing a library that parses some structured binary data into a set of objects. These objects are expected to outlive any user code, and would normally be freed at the end of or after the main function.
I am using shared (and weak) pointers to manage the memory of each object, but it is causing a lot of added complexity to the program, and raises structural issues that I will not get into in this particular question.
Considering that:
traversing the entirety of the binary data is expensive and I cannot afford to do it more than one time,
each visited entry is used to build an object, that then gets registered (i.e. added into the set),
entries in the binary data may rely on other entries that appear later; those get parsed immediately, and are registered when the entry is visited again,
duplicate entries may appear at any moment, but I need to merge those duplicates into one instance (and update any pointer referencing those duplicates to the new merged entry) before registration,
every single one of those objects is guaranteed to be of one of many POD types deriving a common class, so nothing except memory needs to be cleaned up,
the resulting program will run on a modern OS (or in this case, that collects memory from dead processes),
I am very tempted to just use raw pointers, never free the memory taken by those objects and let the OS do its cleanup after the process exits.
What would be the best course of action?
If you're writing reusable code, you need to at least provide the option of cleaning up. What if some program uses your library for one operation, and then continues running? It's not safe to assume that the process exits immediately after your library's task is complete.
The other answers cover the general and standard approach: in an ideal world, yes, you'd clean up your memory, because it makes the code more generic and more reusable and helps with tooling. As others have said, std::unique_ptr for owning pointers and raw pointers for non-owning pointers should work well.
There are a couple of more specialized approaches that may or may not be useful:
Use a pool allocator (such as Boost.Pool, or roll your own) to allocate a bunch of memory up front then dole out pieces of it for your objects. You can then free every object at once by deleting the pool.
Intentionally not freeing memory is occasionally a valid technique. See, e.g., "Increasing Compiler Performance by Over 75%", by Walter Bright. Of course, a compiler is a specialized problem domain, and Walter Bright is probably one of the top compiler developers alive, so techniques that work for his problem domain shouldn't be blindly applied elsewhere.
the resulting program will run on a modern OS (or in this case, that collects memory from dead processes)
I am very tempted to just use raw pointers, never free the memory taken by those objects and let the OS do its cleanup after the process exits.
If you take this approach, then anyone who uses your library and then uses valgrind to try to detect memory leaks in their program will report massive leaks coming from your library and complain to you about it, so if I were you I definitely would not do this.
If you are writing a library then you should provide a cleanup function that frees all memory that you allocated.
A practical example of why this is useful is if a Windows DLL uses your library. When the library is loaded, static data is initialized. When the library is unloaded, static data is cleared. If your library has some global pointers to memory that is never freed, then load-unload cycles of the DLL will leak memory.
If the objects are all of the same type, then rather than allocating each one independently, you could just put them all into a vector and have them refer to each other by index number instead of using pointers. The vector's built-in memory management takes care of allocating space as needed, and when you're done with the objects, you can just destroy the vector to deallocate them all at once. (Note that vector::clear() doesn't actually free the memory, though it does make it available to store a new set of objects in the vector.)
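A minimal sketch of the index-based approach, with a hypothetical `Entry` type and `EntryTable` wrapper. Indices stay valid when the vector reallocates, which raw pointers into the vector would not.

```cpp
#include <cstddef>
#include <vector>

constexpr std::size_t kNoParent = static_cast<std::size_t>(-1);

struct Entry {
    int payload;
    std::size_t parent;  // index into the same vector, not a pointer
};

class EntryTable {
public:
    // Appends an entry and returns its index.
    std::size_t add(int payload, std::size_t parent = kNoParent) {
        entries_.push_back(Entry{payload, parent});
        return entries_.size() - 1;
    }
    // Counts how many parent links separate entry i from a root.
    std::size_t depth(std::size_t i) const {
        std::size_t d = 0;
        while (entries_[i].parent != kNoParent) {
            i = entries_[i].parent;
            ++d;
        }
        return d;
    }
    // Swapping with an empty vector releases the storage, which clear() alone does not.
    void release() { std::vector<Entry>().swap(entries_); }
    std::size_t capacity() const { return entries_.capacity(); }

private:
    std::vector<Entry> entries_;
};
```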
If your objects aren't all the same type, you'll want to look into the more general concept of region-based memory management. As above, the idea is that you can allocate all your objects in a relatively small number of memory chunks (possibly just one), which can be freed later without having to track all the individual objects allocated within.
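One way a region might look for trivially destructible objects of mixed types is sketched below. `Region`, `IntEntry` and `PairEntry` are hypothetical names, the chunk size is arbitrary, and the sketch assumes no type needs stricter alignment than `malloc` provides.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

class Region {
public:
    explicit Region(std::size_t chunk_size = 4096) : chunk_size_(chunk_size) {}
    Region(const Region&) = delete;
    Region& operator=(const Region&) = delete;
    // Frees every chunk at once; no per-object bookkeeping is kept.
    ~Region() { for (void* chunk : chunks_) std::free(chunk); }

    template <typename T, typename... Args>
    T* make(Args&&... args) {
        static_assert(std::is_trivially_destructible<T>::value,
                      "Region never runs destructors");
        return new (allocate(sizeof(T), alignof(T))) T{std::forward<Args>(args)...};
    }
    std::size_t chunk_count() const { return chunks_.size(); }

private:
    void* allocate(std::size_t size, std::size_t align) {
        std::size_t start = (used_ + align - 1) / align * align;  // round up to alignment
        if (chunks_.empty() || start + size > capacity_) {
            capacity_ = size > chunk_size_ ? size : chunk_size_;
            void* chunk = std::malloc(capacity_);
            if (!chunk) throw std::bad_alloc();
            chunks_.push_back(chunk);
            start = 0;
        }
        used_ = start + size;
        return static_cast<char*>(chunks_.back()) + start;
    }

    std::size_t chunk_size_;
    std::size_t capacity_ = 0;  // size of the current chunk
    std::size_t used_ = 0;      // bytes used in the current chunk
    std::vector<void*> chunks_;
};

struct IntEntry { int id; };
struct PairEntry { double weight; IntEntry* ref; };
```

Objects allocated this way can point at each other with raw pointers, since they all die together when the `Region` is destroyed.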
If your ownership and lifetimes are clear I suggest you use unique_ptr for the owning pointers and raw pointers for the non-owning pointers. It should be less complex than shared_ptr and weak_ptr whilst still managing memory automatically.
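As a small sketch of that split, assume a hypothetical `Registry` that owns every parsed `Entry`, while entries refer to each other through non-owning raw pointers:

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct Entry {
    int id;
    Entry* depends_on;  // non-owning: the Registry keeps the target alive
};

class Registry {
public:
    // Takes ownership of a new entry and hands back a non-owning pointer to it.
    Entry* add(int id, Entry* depends_on = nullptr) {
        entries_.push_back(std::unique_ptr<Entry>(new Entry{id, depends_on}));
        return entries_.back().get();
    }
    std::size_t size() const { return entries_.size(); }

private:
    // Each entry is a separate heap object, so the raw pointers handed out
    // stay valid when the vector itself reallocates.
    std::vector<std::unique_ptr<Entry>> entries_;
};
```

Destroying the `Registry` frees everything, and no reference counts are involved.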
I don't think not managing memory at all is an option. But I think using smart pointers to express ownership is not just about good memory management it also makes code easier to reason about.
Try to think of future maintenance work. Suppose your code needs to be broken up or other stuff done after it. In this case you're opening yourself up to leaks or being a resource hog later down the line.
Cleaning up (or being able to do so) is good. It may seem obvious now that an application should work with a single structured binary dataset throughout its entire lifetime, but you'll start kicking yourself once you realize you need to write an application that needs to reset half-way through and start over with another dataset.
(a related thing that's easy to overlook is that an application may need to work with two completely independent datasets at the same time, so try not to design your library to exclude that use case!)
That said, I think you may be focusing too much on the extremes. Code that shouldn't participate in memory management can use raw pointers, and this is reasonable when there is no risk of these pointers outliving your structured dataset in memory.
However, that doesn't mean that code that does participate in memory management needs to use raw pointers too. You can use smart pointers to manage your data structures even if you are passing raw pointers out to the user.
That aside, keep in mind that, in my experience, pointers are usually the wrong semantics. Most use cases are most natural with reference or value semantics, which means you should be passing around raw references, or passing around lightweight wrapper classes that have reference or value semantics but are implemented as containing a pointer to the actual data. Or even a copy of the actual data, if appropriate.
I'm guessing not (or, if possible will almost certainly not be worth my time).
The way I'm thinking is to construct a mirror object that contains offset_ptrs and also sits in the class that owns the lua_State, which the child processes can use to obtain the locations of the relevant pointers to the state, whether or not that's feasible... there are also other objects that lua_ methods would probably access, and I'm not sure how I would pass them the correct addresses...
Guessing I would need a special allocator too, not sure if this is supported?
Since Lua is implemented purely in standard C, allocating a lua_State in shared memory is clearly not supported out-of-the-box. You could look at modifying the source to implement that functionality manually, but it probably wouldn't be worth the trouble. Instead you should keep lua_States out of shared memory, and just copy any important data into shared memory if necessary.
I always read about the possibility of writing a new definition for smart pointer behavior, but to this day I can't find a real example.
Now I want to propose this problem and see if I can get a solution:
Smart pointers use reference counting or reference linking to manage their lifetime. My basic problem consists of adding a new state that can cause the release and deletion of my pointers: I would like to free my resources when an event is triggered.
It's more or less like playing a game: usually resources are loaded and freed when the user passes from level 1 to level 2, so when this happens the resources from level 1 are freed. I would like to stick with this example because you can't wait for the automatic reference counting and think, OK, if a resource from level 1 is not used in level 2 it will be freed automatically because it's no longer requested. That may be true, but operating on memory while a user is putting the machine under stress is a really bad move.
I would like to stick with the smart pointers because I am also interested in all the other features that they offer, but they have this big downside for me and I need to manage their life cycle in a direct way.
What options do I have?
It sounds like you just threw in a bunch of shared pointers and didn't bother thinking properly about ownership of the resources. From the description it seems clear to me that the level object should own all these resources. That means they should not be shared, and thus no reference counting is needed at all.
Using smart pointers frees you from the burden of releasing the resources manually at the appropriate time, but it doesn't free you from thinking about what the appropriate time is.
If I am correct and level should own the resources, use a smart pointer that gives unique ownership, or just use automatic objects and don't even bother with smart pointers. Every other object except the level that needs access to those objects needs a non-owning pointer or reference to it, i.e., a traditional pointer or reference.
If I am wrong and the resources should indeed be shared, then they should not be released when the level is destroyed: the other objects that were sharing ownership will not like it.
If you have shared-pointers remaining after a level is finished, that should be due to the fact that the shared-pointers were not allocated as automatic variables on the stack of functions called during the life-time of the level, but rather are stored in some type of container or series of globally accessible containers. Thus the main issue you need to be concerned with is managing the life-cycle of the containers containing the shared-pointers that manage the resources for each level.
For instance, a smart-pointer allocated on the stack inside some function foo will only have a life-time that corresponds to the duration of the function call. Once the function call is complete, the shared-pointer is destroyed. If there are additional shared-pointers still pointing to a resource, then the resource itself won't be destroyed, but those additional shared-pointers need to reside somewhere other than the stack of the callee. So your job is to manage those "other locations", which I'm guessing are most likely some series of globally accessible containers.
Therefore flushing the containers that manage the data for each level should in turn completely destroy the allocated resources for each level. If you want, you could use an event-driven interface or simple observer pattern for triggering the flush of the containers, or it could simply be done explicitly by the destructor for an object that manages the life-time of level resources.
In the end though it comes down to resource management ... just because a shared-pointer is designed to prevent memory leaks does not mean you shouldn't keep track of how or where they are allocated. If you centralize the storage of your shared pointers, then destroying the resources they manage will not be that big a deal.
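A sketch of centralized storage, assuming a hypothetical `LevelResources` container and `Texture` type. Flushing the container drops the references it holds, and a resource is destroyed at that moment as long as nothing else still shares it.

```cpp
#include <map>
#include <memory>
#include <string>

struct Texture {
    std::string name;
};

class LevelResources {
public:
    // Returns the shared resource, creating it on first request.
    std::shared_ptr<Texture> load(const std::string& name) {
        std::shared_ptr<Texture>& slot = resources_[name];
        if (!slot) slot = std::make_shared<Texture>(Texture{name});
        return slot;
    }
    // Drops the container's references, for example when the level ends.
    void flush() { resources_.clear(); }

private:
    std::map<std::string, std::shared_ptr<Texture>> resources_;
};
```

Code that only needs to observe a resource can hold a `std::weak_ptr` and check `expired()` after a flush instead of keeping the resource alive.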
As mentioned before, shared pointers for your particular problem might not be the right solution. However, you can still manage their destruction manually if you'd like to time-slice the freeing of your level, for example (or maybe even wait until later, when you have some spare CPU cycles, if memory pressure isn't too high or a concern). Once you are done with a resource, simply queue it up for destruction in a global queue (or several queues, perhaps one per type or by priority, etc.). Later on, simply process that queue and remove the references. This will trigger the object's destruction if it is indeed the last reference to it. You could easily check a timer every few iterations, for example, and once you've spent 1ms freeing stuff, quit and continue on the next frame.
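The queue idea might be sketched like this. `DeferredReleaseQueue` is a made-up name, and for simplicity this version checks the clock after every release rather than every few iterations.

```cpp
#include <chrono>
#include <cstddef>
#include <deque>
#include <memory>
#include <utility>

class DeferredReleaseQueue {
public:
    // Parks a reference here instead of dropping it immediately.
    void retire(std::shared_ptr<void> resource) {
        pending_.push_back(std::move(resource));
    }

    // Drops queued references until the queue is empty or the time budget is
    // spent. Always releases at least one item if any are pending.
    std::size_t drain(std::chrono::microseconds budget) {
        const auto deadline = std::chrono::steady_clock::now() + budget;
        std::size_t released = 0;
        while (!pending_.empty()) {
            pending_.pop_front();  // destroys the object if this was the last reference
            ++released;
            if (std::chrono::steady_clock::now() >= deadline) break;
        }
        return released;
    }

    std::size_t pending() const { return pending_.size(); }

private:
    std::deque<std::shared_ptr<void>> pending_;
};
```

A game loop could call something like `queue.drain(std::chrono::microseconds(1000))` once per frame and leave the rest for the next frame.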
Performance wise, using memory pools per level (or per asset package) makes more sense. It makes for easier memory management and you can sometimes get away with freeing the whole pool at once and skip calling the destructors on all the objects within (if you know they do nothing!).
It sounds like you really shouldn't be using shared pointers at all. There are ways to override the behaviors by specifying custom deallocators but I would consider a different type of "smart pointer".
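For reference, a custom deleter is passed as the second argument to the `std::shared_ptr` constructor. The sketch below uses a hypothetical `SoundPool` that counts live objects; it assumes the pool outlives every pointer it hands out.

```cpp
#include <memory>

struct Sound {
    int id;
};

class SoundPool {
public:
    std::shared_ptr<Sound> acquire(int id) {
        Sound* s = new Sound{id};
        ++live_;
        // The lambda runs instead of a plain delete when the last reference goes away.
        return std::shared_ptr<Sound>(s, [this](Sound* p) {
            delete p;
            --live_;
        });
    }
    int live() const { return live_; }

private:
    int live_ = 0;
};
```

The deleter changes what happens at release time, not when release happens, so it does not by itself solve the "free on an event" requirement.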
The Apache web server uses memory pools to deallocate all of the resources associated with a request by having the request own a pool of memory. When you allocate memory inside of the server, you are required to identify the pool that you want to allocate from. The server maintains a handful of memory pools each with a different lifetime - one for the server instance, another for the module, another for each request, etc. This sounds like a better match to your situation.
The Apache code uses their Apache Portable Runtime for memory management. It is written in C and might not be the best match for what you are doing. It does look like Boost has a memory pool library as well though I have never used it.
Does a C++ shared library have its own memory space? Or does it share the caller process' one?
I have a shared library which contains some classes and wrapper functions.
One of these wrapper functions is something like:
libXXX_construct() which initializes an object and returns the pointer to the said object.
Once I use libXXX_construct() in a caller program, where is the object placed? Is it in the "caller" memory space or in the library's memory space?
A linked instance of the shared library shares the memory space of the instance of the executable that linked to it, directly or indirectly. This is true for both Windows and the UN*X-like operating systems. Note that this means that static variables in shared libraries are not a way of inter-process communication (something a lot of people think).
All the shared libraries share the virtual memory space of their process. (Including the main executable itself)
Unless specified otherwise, a shared library will share memory with the process hosting it. Each process instance will then have its own copy.
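A POSIX-only illustration of "each process has its own copy": the global below stands in for a static variable inside a shared library, and `fork()` is used here just as a convenient way to get a second process. The function name is invented for this example.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Stands in for a static variable that lives in a shared library.
static int g_counter = 0;

// The child process changes its copy of g_counter; the function returns the
// value the parent still sees afterwards (or -1 if fork fails).
int counter_after_child_write() {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        g_counter = 99;  // modifies only the child's private copy
        _exit(0);
    }
    int status = 0;
    waitpid(pid, &status, 0);
    return g_counter;
}
```

The parent's value is unchanged after the child exits, which is why ordinary statics cannot carry data between processes.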
However, on Windows it is possible to create shared variables which allow inter-process communication. You do so by putting them in the right kind of segment. By default Windows uses two kinds of segments: data segments are read/write unshared, whereas code segments are read-only executable and shared. However, the read-write and shared attributes are orthogonal. A shared read-write segment in a library can be used to store shared variables, and it will survive until the last process exits.
Be careful with C++, as that will happily run constructors and destructors on process start & exit, even if you put variables in shared segments.
For details, see Peering Inside the PE: A Tour of the Win32 Portable Executable File Format part 2 by Matt Pietrek.
The shared library has the same address space as its host process. It has to be that way, or else you wouldn't be able to pass pointers from one module to another since they wouldn't be able to dereference them.
But although they are in the same address space, that doesn't mean they all use the same memory manager. The consequence is that if you provide a function that allocates memory on behalf of the caller, then you should provide a corresponding function to free that memory, say, libXXX_destroy().
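A minimal sketch of such a pair. Only the names `libXXX_construct` and `libXXX_destroy` come from the discussion; the `libXXX_object` type and its field are placeholders.

```cpp
#include <new>

// Placeholder for whatever the library really constructs.
struct libXXX_object {
    int counter;
};

// Allocates with the library's own allocator; returns nullptr on failure.
extern "C" libXXX_object* libXXX_construct() {
    return new (std::nothrow) libXXX_object{0};
}

// Frees with the same allocator that libXXX_construct used, so the caller
// never has to call delete or free on the pointer itself.
extern "C" void libXXX_destroy(libXXX_object* obj) {
    delete obj;
}
```

Both functions are compiled into the library, so allocation and deallocation always go through the same memory manager regardless of what the calling program links against.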
Your object exists in the caller's memory space (in fact the one memory space shared between the library and the main executable)
They share an address space, so you can share pointers; however, they don't share an allocator (at least not on Windows).
This means that if you call new to allocate an object inside a shared library you must call delete inside the same library or strange things may happen.
It's true that a library will use up memory in each process that loads it. However, at least under Windows, when multiple processes load the same DLL, unmodified pages (including all code pages) are quietly shared under the covers. Also, they take up no space in the swap file, since they're backed by the original file.
I believe this is more complicated for .NET, due to JIT compilation, but would still be true for NGENed assemblies.
Edit: This is a detail of the VM. However, you can also flag a segment in a DLL to be shared across processes.