I am trying to solve the following issue: I have a custom data container that manages a generic type, and I need to let other application components retrieve the container's internal pointer and use it as if it were a plain T* array region (without treating it as a more intelligent array holder). The problem is that, in one very special case, this memory is moved somewhere else and the original region is erased, yet plenty of components still hold the old data pointer and will keep using it to access their required information.
The setup looks, pseudo-codeish, something like this:
template <typename T>
class container
{
    T* ptr;
public:
    // ContainerInterfaceCode...
};
Hypothesis:
T* ptr is a pseudo-address (a "virtual" address, if I may call it that) which is mapped onto a physical region A.
When an event is raised, T* ptr's mapping is switched to another physical region, B.
Any component that uses T* ptr remains oblivious to the change of physical location, "thinking" its data is still stored at that virtual address.
Conclusion:
I would therefore like to know whether there is a memory-mapping mechanism (virtual to physical) that allows juggling the mapping behind T* ptr, leaving the other application components untouched. Simply put, T* ptr should point to a memory region that is mapped in one place and, upon request, that same pointer should be backed by another place (where the underlying data is copied for consistency). The transition must be seamless.
Note: I can't use wrappers, smart pointers, handles, etc., for the simple reason that it would mean modifying a huge codebase just for one, rather minor, modification.
As I haven't found enough resources dealing with this scenario, can anyone, perhaps, present a short webography with some relevant reading material on the subject?
On Linux you can use shared memory. Shared memory is a mechanism that allows two processes to access the same area of memory; it is a kind of IPC. You can find more details here: http://en.wikipedia.org/wiki/Shared_memory.
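To illustrate the idea on Linux, here is a minimal sketch (error handling omitted; the names "/region_a" and "/region_b" and the sizes are made up): a region is mapped once, and later the same virtual address range is re-backed by a different shared-memory object via MAP_FIXED, so existing T* values into that range keep working after the data has been copied across.

#include <fcntl.h>     // shm_open, O_* flags
#include <sys/mman.h>  // mmap, munmap, MAP_* flags
#include <unistd.h>    // ftruncate, close

int main() {
    const size_t len = 4096;

    // Initial backing: physical space "A".
    int fd_a = shm_open("/region_a", O_CREAT | O_RDWR, 0600);
    ftruncate(fd_a, len);
    void* addr = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd_a, 0);

    // ... components store T* pointers into [addr, addr + len) ...

    // New backing: physical space "B". After copying the data into fd_b,
    // MAP_FIXED replaces the mapping at the SAME virtual address, so the
    // old pointers keep working without any component being modified.
    int fd_b = shm_open("/region_b", O_CREAT | O_RDWR, 0600);
    ftruncate(fd_b, len);
    mmap(addr, len, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, fd_b, 0);

    munmap(addr, len);
    close(fd_a);
    close(fd_b);
    return 0;
}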
Related
I have read the article about Memory-Mapped Files and the example about CreateFileMapping.
My question is: Can I pass a pointer to a struct or a object between two processes using memory-mapped file?
Since there are some answers that it is possible, here is struct that I want to pass:
// First Process
struct OtherStruct{};
struct MyStruct
{
unsigned long handleObject;
unsigned long *phandleObject;
OtherStruct someData;
OtherStruct *pData;
};

MyStruct dataSend = { ... };
WriteToMappedFile(dataSend);
// Second Process
MyStruct dataReceived = ReadFromMappedFile();
As the other answers already stated, you must either rely on the address of the memory-mapped areas to be equal, or you must move from absolute addresses in your pointers to relative addressing.
One possible implementation I stumbled across recently is offset_ptr in Boost.Interprocess, which seems to fit your use case perfectly.
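As a rough sketch of how that looks (segment and object names here are made up, and error handling is omitted): offset_ptr stores a self-relative offset instead of an absolute address, so a linked structure built inside a Boost.Interprocess segment stays valid no matter where each process maps that segment.

#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/offset_ptr.hpp>

namespace bip = boost::interprocess;

struct Node {
    int value;
    bip::offset_ptr<Node> next;  // self-relative, valid at any mapping address
};

int main() {
    bip::managed_shared_memory segment(bip::open_or_create, "demo_segment", 65536);
    Node* a = segment.construct<Node>("node_a")();
    Node* b = segment.construct<Node>("node_b")();
    a->value = 1;
    b->value = 2;
    a->next = b;  // another process opening "demo_segment" can follow this link safely
    return 0;
}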
The answer depends on what you want to achieve. Passing a pointer in shared memory is easy, but the other process may not be able to use it in the way you expect.
Note that a pointer contains a virtual address of the data structure it points to. Such a virtual address is only valid within the process that holds the pointed-to data structure. If you pass the pointer to another process, the other process will have its own virtual address space, and the passed pointer loses its validity.
So the answer to your question is: Yes, you can pass the pointer, but without further actions, you won't be able to successfully use this pointer in the receiving process. Specifically, you will most probably not be able to use it for accessing the struct or object it points to.
If you want to access the struct or object within the other process, you need to do the following:
Put the object itself into shared memory.
Convert the pointer to the object into an offset relative to the beginning of the memory mapped file.
Pass this offset to the other process.
In the other process, use the offset to convert back to a pointer.
boost::offset_ptr can help you with part of that.
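A minimal sketch of steps 2 and 4 (the base pointer is whatever MapViewOfFile or mmap returned in each process; the function names are made up):

#include <cstddef>

// Convert an absolute pointer into an offset relative to the start of the mapping.
std::size_t to_offset(const void* base, const void* p) {
    return static_cast<std::size_t>(
        static_cast<const char*>(p) - static_cast<const char*>(base));
}

// Convert an offset back into a pointer, using the receiving process's own base address.
void* to_pointer(void* base, std::size_t offset) {
    return static_cast<char*>(base) + offset;
}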
Assuming the pointer points to a struct that is part of the same memory-mapped region, yes, that can make sense. But then you have to ensure that the memory-mapped region is mapped at the same virtual address in both processes, which is not always guaranteed and is a bad way to design things.
You can pass an offset instead and deal with relative offsets everywhere for structures present in this memory region.
If the pointer you want to pass via the memory-mapped file was not allocated by GlobalAlloc and locked by GlobalLock, this can't work. However, you already have memory allocated for the data you want to pass, so you can simply rewrite the data itself into the memory-mapped file.
I have a vector of journeys and a vector of locations. A journey is between two places.
struct Data {
std::vector<Journey> m_journeys;
std::vector<Locations> m_locations;
};
struct Journey {
?? m_startLocation;
?? m_endLocation;
};
How can I create the relationship between each journey and two locations?
I thought I could just store references/pointers to the start and end locations, however if more locations are added to the vector, then it will reallocate storage and move all the locations elsewhere in memory, and then the pointers to the locations will point to junk.
I could store the place names and then search the list in Data, but that would require keeping a reference to Data (breaking encapsulation/SRP), and then a not so efficient search.
I think that if all the objects were created on the heap, then shared_ptr could be used (so Data would contain std::vector<std::shared_ptr<Journey>>), and then this would work? (It would require a massive rewrite, so avoiding this would be preferable.)
Is there some C++/STL feature that is like a pointer but abstracts away/is independent of memory location (or order in the vector)?
No, there isn't any "C++/STL feature that is like a pointer but abstracts away/is independent of memory location".
That answers that.
This is simply not the right set of containers for such a relationship between classes. You have to pick the appropriate container for your objects first, instead of selecting some arbitrary container first, and then trying to figure out how to make it work with your relationship.
Using a vector of std::shared_ptrs would be one option; you just need to watch out for circular references. Another option would be to use std::list instead of std::vector, since std::list never relocates its existing elements when it grows.
If each Locations instance has a unique identifier of some kind, you could use a std::map, store that identifier in each Journey to refer to a location, and look it up in the map when needed. A std::map also doesn't relocate its elements when it grows, and the extra layer of indirection offers some value as well.
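A rough sketch of that last approach (the type and member names here are only illustrative):

#include <map>
#include <string>
#include <vector>

using LocationId = int;                // the unique identifier mentioned above

struct Location {
    std::string name;
};

struct Journey {
    LocationId m_startLocation;        // store ids instead of pointers
    LocationId m_endLocation;
};

struct Data {
    std::vector<Journey> m_journeys;
    std::map<LocationId, Location> m_locations;  // node-based: elements never move on growth

    const Location& startOf(const Journey& j) const { return m_locations.at(j.m_startLocation); }
    const Location& endOf(const Journey& j) const   { return m_locations.at(j.m_endLocation); }
};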
I'd say make a vector<shared_ptr<Location>> for your index of locations, and have Journey contain two weak_ptr<Location>.
struct Data {
std::vector<Journey> m_journeys;
std::vector<std::shared_ptr<Location>> m_locations;
};
struct Journey {
std::weak_ptr<Location> m_startLocation;
std::weak_ptr<Location> m_endLocation;
};
std::weak_ptr can dangle and that's exactly what you want. :)
The concern is that one could access a Journey containing a deleted Location. A weak pointer provides an expired() method that can tell you if the data of the parent shared pointer (that would be in your m_locations vector) still exists.
Accessing data through a weak pointer is safe, but it requires the use of the lock() method.
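For instance, resolving a journey's start location might look roughly like this (a hypothetical snippet, assuming the structs above):

if (std::shared_ptr<Location> start = journey.m_startLocation.lock()) {
    // The Location still exists; 'start' keeps it alive for the duration of this scope.
    // ... use *start here ...
} else {
    // The parent shared_ptr is gone (the Location was removed from m_locations).
}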
Here is a great example of how one usually uses a weak pointer:
http://en.cppreference.com/w/cpp/memory/weak_ptr/lock
I'm developing a game server for a video game called Tibia.
Basically, there can be up to millions of objects, of which there can be up to thousands of deletes and re-creations as players interact with the game world.
The thing is, the original creators used a Slot Map / Object Pool in which pointers are re-used when an object is removed. This is a huge performance boost, since there is no need to do much memory reallocation unless it is actually required.
And of course, I'm trying to accomplish that myself, but I've come into one huge problem with my Slot Map:
Here's a brief explanation of how a Slot Map works, according to a source I found online:
The Object class is the base class for every game object; my Slot Map / Object Pool uses this Object class to store every allocated object.
Example:
struct TObjectBlock
{
Object Object[36768];
};
The way the slot map works is that the server first allocates, say, 36768 objects in a list of TObjectBlock and gives each Object a unique ObjectID; an ID can be re-used from a free-object list when the server needs to create a new object.
Example:
Object 1 (ID: 555) is deleted, and its ID 555 is put in a free-object-ID list. An Item creation is requested, and ID 555 is reused since it's on the free-object list, so there is no need to allocate another TObjectBlock in the array for further objects.
My problem: how can I make "Player", "Creature", "Item", and "Tile" work with this Slot Map? I can't seem to come up with a solution to this logic problem.
I am using a base class to manage all objects:
struct Object
{
uint32_t ObjectID;
int32_t posx;
int32_t posy;
int32_t posz;
};
Then, I'd create the objects themselves:
struct Creature : Object
{
char Name[31];
};
struct Player : Creature
{
};
struct Item : Object
{
uint16_t Attack;
};
struct Tile : Object
{
};
But now if I was to make use of the slot map, I'd have to do something like this:
Object allocatedObject;
allocatedObject.ObjectID = CreateObject(); // Get a free object ID to use

if (allocatedObject.ObjectID != INVALIDOBJECT.ObjectID)
{
    Creature* monster = new Creature();
    // This doesn't make much sense, since I'd have this creature pointer floating around!
    monster->ObjectID = allocatedObject.ObjectID;
}
It doesn't make much sense to give a whole new, separately allocated object the unique ID of the already-allocated object.
What are my options with this logic?
I believe you have a lot of tangled concepts here, and you need to detangle them to make this work.
First, you are actually defeating the primary purpose of this model. What you showed smells badly of cargo-cult programming. You should not be newing objects, at least not without overloading operator new, if you are serious about this. You should allocate a single large block of memory for a given object type and draw from it on "allocation", be it through an overloaded operator new or through a memory-manager class. That means you need a separate block of memory for each object type, not a single "objects" block.
The whole idea is that if you want to avoid allocation-deallocation of actual memory, you need to reuse the memory. To construct an object, you need enough memory to fit it, and your types are not the same length. Only Tile in your example is the same size as Object, so only that could share the same memory (but it shouldn't). None of the other types can be placed in the objects memory because they are longer. You need separate pools for each type.
Second, there should be no bearing of the object ID on how things are stored. There cannot be, once you take the first point into consideration, if the IDs are shared and the memory is not. But it must be pointed out explicitly - the position in a memory block is largely arbitrary and the IDs are not.
Why? Let's say you take object 40, "delete" it, then create a new object 40. Now let's say some buggy part of the program referenced the original ID 40. It goes looking for the original 40, which should error, but instead finds the new 40. You just created an entirely untrackable error. While this can happen with pointers, it is far more likely to happen with IDs, because few systems impose checks on ID usage. A main reason for indirecting access with IDs is to make access safer by making it easy to catch bad usage, so by making IDs reusable, you make them just as unsafe as storing pointers.
The actual model for handling this should look like how the operating system does similar operations (see below the divide for more on that...). That is to say, follow a model like this:
Create some sort of array (like a vector) of the type you want to store - the actual type, not pointers to it. Not Object, which is a generic base, but something like Player.
Size that to the size you expect to need.
Create a stack of size_t (for indexes) and push into it every index in the array. If you created 10 objects, you push 0 1 2 3 4 5 6 7 8 9.
Every time you need an object, pop an index from the stack and use the memory in that cell of the array.
If you run out of indexes, increase the size of the vector and push the newly created indexes.
When you use objects, indirect via the index that was popped.
Essentially, you need a class to manage the memory; a minimal sketch of the idea follows below.
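Here is a minimal sketch of such a class under the model above (all names are made up, and real code would want bounds checks plus a generation counter per slot to catch stale IDs):

#include <cstddef>
#include <stack>
#include <vector>

template <typename T>                      // one pool per concrete type: Pool<Player>, Pool<Item>, ...
class Pool {
    std::vector<T> m_slots;                // the actual objects, stored by value
    std::stack<std::size_t> m_free;        // indexes of currently unused slots
public:
    explicit Pool(std::size_t n) : m_slots(n) {
        for (std::size_t i = 0; i < n; ++i) m_free.push(i);
    }

    std::size_t acquire() {                // pop a free index, growing the array if exhausted
        if (m_free.empty()) {
            std::size_t old = m_slots.size();
            m_slots.resize(old == 0 ? 1 : old * 2);
            for (std::size_t i = old; i < m_slots.size(); ++i) m_free.push(i);
        }
        std::size_t index = m_free.top();
        m_free.pop();
        return index;
    }

    void release(std::size_t index) {      // return an index to the pool for reuse
        m_slots[index] = T{};              // reset the slot
        m_free.push(index);
    }

    T& operator[](std::size_t index) { return m_slots[index]; }  // all access goes via the index
};

// Usage: Pool<Player> players(36768); std::size_t id = players.acquire(); players[id].posx = 0;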
An alternative model would be to push pointers of the matching pointer type directly onto a stack. There are benefits to that, but it is also harder to debug. The primary benefit of that system is that it can easily be integrated into existing systems; however, most standard library implementations already do something similar...
That said, I advise against this. It seems like a good idea on paper, and on very limited systems it is, but modern operating systems are not "limited systems" by that definition. Virtual memory already resolves the biggest reason to do this, memory fragmentation (which you did not mention). Many standard library allocators already attempt to do more or less what you are trying to do here for the standard containers by drawing from memory pools, and those are far more manageable to use.
I once implemented a system just like this, but for many good reasons have ditched it in favor of a collection of unordered maps of pointers. I have plans to replace allocators if I discover performance or memory problems associated with this model. This lets me offset the concern of managing memory until testing/optimization, and doesn't require quirky system design at every level to handle abstraction.
When I say "quirky", believe me when I say that there are many more annoyances with the indirection-pool-stack design than I have listed.
I know this is strange but I'm just having fun.
I am attempting to transmit a std::map (instantiated using placement new in a fixed region of memory) between two processes via a socket between two machines: Master and Slave. The map I'm using has this typedef:
// A vector of Page objects
typedef
std::vector<Page*,
PageTableAllocator<Page*> >
PageVectorType;
// A mapping of binary 'ip address' to a PageVector
typedef
std::map<uint32_t,
PageVectorType*,
std::less<uint32_t>,
PageTableAllocator<std::pair<uint32_t, PageVectorType*> > >
PageTableType;
The PageTableAllocator<T> class is responsible for allocating whatever memory the STL containers may want/need into a fixed location in memory. E.g., all Page objects and STL internal structures are being instantiated in this fixed memory region. This ensures that the std::map object and the allocator are both placed in a fixed region of memory. I've used GDB to make sure the map and allocator behave correctly (all memory used is in the fixed region; nothing ever goes on the application's normal heap).
Assuming Master starts up and initializes all of its STL structures and the special memory region, the following happens: Slave starts, prints out its version of the page table, then looks for a Master. Slave finds a Master, deletes its own version of the page table, copies Master's version of the page table (and the special memory region), and successfully prints out Master's version of the page table. From what prodding I've done in GDB, I can perform many read-only operations.
When trying to add to the newly copied PageTableType object, Slave faults in the allocator's void construct (pointer p, const T& value) method. The value passed in as p points to an already allocated area of memory (as per Master's version of the std::map).
I don't know anything about C++ object structure, but I'm guessing that object state from Slave's version of the PageTableType must be hanging around even after I replace all of the memory that the PageTableType and its allocator used. My question is whether this is a valid concern. Does C++ maintain some sort of object state outside of the area of memory that the object was instantiated in?
All of the objects used in the map are non-POD. Same is true for the allocator.
To answer your specific question:
Does C++ maintain some sort of object state outside of the area of memory that object was instantiated in?
The answer is no. There are no other data structures set up to "track" objects or anything of the sort. C++ uses an explicit memory allocation model, so if you choose to be responsible for allocation and deallocation, then you have complete control.
I suspect there's something wrong in your code somewhere, but since you believe the code is correct you're inventing some other reason why your code might be failing, and following that path instead. I would pull back, and carefully examine everything about the way your code is working right now, and see if you can determine the problem. Although the STL classes are complex (especially std::map), they're ultimately just code and there is no hidden magic in there.
I open a piece of shared memory and get a handle of it. I'm aware there are several vectors of data stored in the memory. I'd like to access those vectors of data and perform some actions on them. How can I achieve this? Is it appropriate to treat the shared memory as an object so that we can define those vectors as fields of the object and those needed actions as member functions of the object?
I've never dealt with shared memory before. To make things worse, I'm new to C++ and POSIX. Could someone please provide some guidance? Simple examples would be greatly appreciated.
int my_shmid = shmget(key, size, shmflags);
...
void* address_of_my_shm1 = shmat(my_shmid, 0, shmflags);
Object* optr = static_cast<Object*>(address_of_my_shm1);

// ... or, in some other thread/process to which you arranged to pass
// address_of_my_shm1 by some other means:
void* address_of_my_shm2 = shmat(my_shmid, address_of_my_shm1, shmflags);
You may want to assert that address_of_my_shm1 == address_of_my_shm2. But note that I say "may" - you don't actually have to do this. Some types/structs/classes can be read equally well at different addresses.
If the object will appear in different address spaces, then pointers outside the shm in process A may not point to the same thing in process B. In general, pointers that point outside the shm are bad. (Virtual functions are pointers outside the object, and outside the shm. Bad, unless you have other reason to trust them.)
Pointers inside the shm are usable, if they appear at the same address.
Relative pointers can be quite usable, but, again, so long as they point only inside the shm. Relative pointers may be relative to the base of an object, i.e. they may be offsets. Or they may be relative to the pointer itself. You can define some nice classes/templates that do these calculations, with casting going on under the hood.
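For illustration, a hand-rolled self-relative pointer might look roughly like this (a simplified sketch; boost::interprocess::offset_ptr is a production-quality version of the same idea):

#include <cstddef>
#include <cstdint>

// Stores the distance from its own address to the target, so it stays valid wherever
// the containing shm segment is mapped, as long as both pointer and target live inside it.
template <typename T>
class self_relative_ptr {
    std::ptrdiff_t m_offset = 0;   // 0 is treated as null here (boost's offset_ptr
                                   // uses a cleverer encoding to allow a real 0 offset)
public:
    self_relative_ptr& operator=(T* p) {
        m_offset = p ? reinterpret_cast<std::intptr_t>(p) -
                       reinterpret_cast<std::intptr_t>(this)
                     : 0;
        return *this;
    }
    T* get() const {
        return m_offset ? reinterpret_cast<T*>(
                              reinterpret_cast<std::intptr_t>(this) + m_offset)
                        : nullptr;
    }
    T& operator*()  const { return *get(); }
    T* operator->() const { return get(); }
};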
Sharing of objects through shmem is simplest if the data is just POD (Plain Old Data). Nothing fancy.
Because you are in different processes that are not sharing the whole address space, you may not be guaranteed that things like virtual functions will appear at the same address in all processes using the shm shared memory segment. So probably best to avoid virtual functions. (If you try hard and/or know linkage, you may in some circumstances be able to share virtual functions. But that is one of the first things I would disable if I had to debug.)
You should only do this if you are aware of your implementation's object memory model, and if advanced (for C++) optimizations like splitting structs into discontiguous hot and cold parts are disabled. Since such optimizations are arguably not legal for C++, you are probably safe.
Obviously you are better off if you are casting to the same object type/class on all sides.
You can get away with non-virtual functions. However, note that it can be quite easy to have the same class, but different versions of the class - e.g. differing in size, e.g. adding a new field and changing the offsets of all of the other fields - so you need to be quite careful to ensure all sides are using the same definitions and declarations.