C++: Vector of Pointers to Objects from another Vector

I have two classes, similar to this:
class B; // forward declaration, since A holds a B*
class A
{
public:
    B* ptr1;
};
class B
{
public:
    std::vector<A*> list;
};
In the main implementation, I'm doing something like this:
int main() {
    // there are a lot more A objects than B objects, i.e. listOfA.size() >>> listOfB.size()
    std::vector<A> listOfA;
    std::vector<B> listOfB;
    while ( /* some loop */ )
    {
        listOfB[jj].list.push_back( &(listOfA[ii]) );
        listOfA[ii].ptr1 = &( listOfB[jj] );
    }
} // int main end
Basically something like this. A lot of A objects are assigned to one B object, and these A objects are stored in that pointer vector as pointers. Additionally, each of these A objects gets a pointer to the B object it belongs to. For context, I'm basically doing a Connected Components algorithm with run-length encoding (for image segmentation), where class A represents the line segments and class B the final objects in the image.
So, the pointers in the vector in class B all point to objects which are stored in a regular vector. These objects should be deleted when the regular vector goes out of scope, right? I've read that a vector of pointers like the one in class B usually requires writing a manual destructor, but that shouldn't be the case here, I think...
The reason why I'm asking is, of course, because my code keeps crashing. I'm using an Asus Xtion Pro camera to get the images and then am performing the algorithm on every image. Weird thing is, the program crashes whenever I shake the camera a bit harder. When the camera is stationary or moved only a little or slowly, nothing happens. Also, when I use a different algorithm (also connected components, but without run-length-encoding and also doesn't use pointers), nothing crashes, no matter how much I shake the camera. Also, in Debug mode (which ran much slower than the Release mode), nothing crashed either.
I tried making a destructor for the pointer vector in class B, but it resulted in a "block is valid" error, so I guess it deleted something twice.
I also tried replacing every pointer with a C++11 std::shared_ptr, but that only produced very irregular behaviour and the code still crashed when I shook the camera.
I basically just want to know if in terms of memory leaking and pointer handling, the code shown above seems fine or if there are mistakes in the code which could lead to crashes.
EDIT (SOLVED): The solution (see the accepted answer) was to ensure that the vector 'listOfB' doesn't get resized during run-time, for example by using 'reserve()' to reserve enough space for it. After doing this, everything worked fine! Apparently it worked because, if the vector 'listOfB' gets resized (by push_back()), the internal memory addresses of the B instances in it also change, causing the pointers in the A instances (which point to B instances) to point to the wrong addresses - and thus resulting in trouble, which led to the crash.
About the camera shaking: apparently, shaking the camera resulted in very blurry pictures with lots of elements to segment, thus increasing the number of objects (i.e., requiring a larger size for listOfB). So, mystery solved! Thanks a lot! :-)
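For reference, a minimal sketch of that fix (maxExpectedObjects is a made-up upper bound here; it has to be chosen for the actual data):
std::vector<A> listOfA;
std::vector<B> listOfB;
listOfB.reserve(maxExpectedObjects); // capacity fixed up front, so later push_back()
                                     // calls never reallocate and the B addresses
                                     // stored in A::ptr1 stay valid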

I think the design is broken. listOfB will grow (you do push_backs) and re-allocate its internal data array, invalidating all addresses stored in the ptr1 members of the A instances. The usual growth strategy increases the data size by a factor of 2, which may explain why you are fine for a while if not too much data arrives. Also, as long as the memory of the old data is still in the address space of the program (especially if it is on the same memory page, for example because the new data fits in it as well), the program may not crash accessing it and will just retrieve stale data.
On a more constructive note: Your solution will work if you know the maximum elements in advance, which may be hard (think you get a 4k camera next year ;-)). In that case you can, by the way, just take a simple static array anyway.
Perhaps you could also use a std::map to store A objects instead of a simple vector listofA. Each A object would need a unique ID of some sort (static counter in A in the easiest case) to use as a key into the map. Bs would store keys, not addresses of As.
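A rough sketch of that ID-based variant (the names below are illustrative, not from the question):
#include <map>
#include <vector>

struct A { int ownerId = -1; };            // stores the key of its B instead of a B*
struct B { std::vector<int> memberIds; };  // stores keys of its As instead of A*

std::map<int, A> storageOfA;               // unique ID -> A object
int nextAId = 0;                           // the "static counter" idea, simplified

int addA(const A& a) { storageOfA[nextAId] = a; return nextAId++; }
// Lookups always go through the key, so growing the containers can never
// leave a dangling pointer behind: A& a = storageOfA.at(someB.memberIds[k]);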

Assuming you have not made a mistake in how you build your network, you should be fine. You would have to post more code to assess that. Also, you cannot use pointers into either of the vectors after you change one of them, because if a vector reallocates its storage, all pointers pointing into it are invalidated. But using raw pointers to managed objects is the correct way to build networks.
By managed objects I mean objects whose lifetime is guaranteed to last longer than the network and whose memory will be automatically released. So they should be elements of a container or objects managed by some kind of smart pointer.
However it sounds like you have a hardware problem.
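To illustrate the "managed objects" point above (a sketch only, assuming you are free to change the container type): keeping the B objects behind smart pointers makes their addresses stable even while the container grows.
#include <memory>
#include <vector>

struct B;
struct A { B* ptr1 = nullptr; };
struct B { std::vector<A*> list; };

std::vector<std::unique_ptr<B>> listOfB;   // the vector may reallocate, but the
                                           // B objects it owns never move, so raw
                                           // B* handed out below stay valid
// listOfB.push_back(std::unique_ptr<B>(new B()));
// B* stable = listOfB.back().get();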

Should I use reference member to deal with dependency between objects?

As a beginner in C++, I am practicing C++ with an algorithm assignment. Along the way, I have some questions that I have difficulty getting through. Pardon me if the questions sound entry-level since I am still learning.
The goal is to find collinear points in a given vector of points with three classes. The following briefly describes the three classes' purposes:
Point: Representing a point with x and y values.
LineSegment: Representing a line segment with two points at ends.
Collinear: Containing the segments found in a vector of Points. The main part of the algorithm.
So, I would expect the client code to look like this:
std::vector<Point> points; // may become huge
// populate points
// ...
Collinear collinear_points(points);
std::vector<LineSegment> segments_in_points = collinear_points.GetSegments();
Since the class Collinear depends on a certain vector of points to get the segments correspondingly, I think it would need it as a data member. The question that keeps haunting me is: should it hold a copy of the vector, or hold a raw pointer/reference to the vector outside the object? I think a smart pointer would be overkill here. According to the old answer here, maybe it is better to go with a reference, which also avoids potentially expensive copying? What is the common practice for this kind of dependency between classes, if one exists?
If points gets modified after the construction of collinear_points, then the data collinear_points references will be inconsistent with the segments it contains. Is it common to leave it to users to make sure an object that depends on other objects stays valid? Is there a way to let collinear_points know the content has been modified and put it in an invalid state?
To answer your actual question from the title: A non-owning raw pointer would still be the usual choice, mainly because that's what we've been doing since the old C days. Of course, a pointer has the problem that it can be nullptr. So using a reference communicates more clearly that null is not an allowed value. Because of that I tend to use the reference, although it still feels a tiny bit weird even to myself. But imo it's the better design decision overall.
That said, I believe the real question here is one of ownership. If Collinear does not own the vector, the user of your API has to make sure that the vector lives at least as long as the associated Collinear object. Otherwise you’ll access a dangling pointer/reference and things tend to go downhill from there. ;)
Is there a way to let collinear_points know the content has been modified and put it in an invalid state?
Yes, there is. Own everything. That includes the points vector and the segments vector. Following this approach Collinear could look something like this:
class Collinear {
public:
// usings because I’m a lazy typer
using PointsVec = std::vector<Point>;
using SegmentsVec = std::vector<LineSegment>;
// Take ownership of the points vector by either copying
// or moving it into a member.
explicit Collinear(const PointsVec& p): m_points(p) {}
explicit Collinear(PointsVec&& p): m_points(std::move(p)) {}
// Retain ownership of the segments vector by returning a const ref.
const SegmentsVec& GetSegments(); // Check if you can make it const.
// access functions for the two vectors ...
private:
PointsVec m_points;
SegmentsVec m_segments;
};
Now Collinear controls access to the points vector. You’ll have to write the functions for the permitted operations as members of Collinear. The important thing is never to return a non-const pointer or non-const ref to m_points, because then you might miss write accesses.
The segments vector is similar. You provide the write-access member functions and Collinear retains ownership, which means it can re-calculate the segments when necessary and the user doesn't need to be concerned with that. Depending on how expensive the calculation is, you can now go wild with lazy evaluation and every optimization you can think of.
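A minimal sketch of what that lazy evaluation could look like (the m_dirty flag and the FindSegments helper are assumptions, not part of the original class):
const SegmentsVec& Collinear::GetSegments() {
    // m_dirty would be set by every member function that modifies m_points;
    // FindSegments is a hypothetical helper that runs the actual collinearity search.
    if (m_dirty) {
        m_segments = FindSegments(m_points);
        m_dirty = false;
    }
    return m_segments;
}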
There is a completely different design approach, though. Own nothing. Does Collinear have to be a class at all? Could it be a bunch of free functions in a namespace?
namespace Collinear {
std::vector<LineSegment> GetSegments(const std::vector<Point>& points);
}
// ...
auto segments_in_points = Collinear::GetSegments(points);
That’s the opposite of the own-everything approach. Before, you had full control. Now your user has full control. On the other hand, they now have to take care of any laziness/optimizations/update detection.
Which approach is appropriate is a question of a) API design philosophy and b) your concrete situation. Who are your users? What do they expect? Which approach makes their lives easier? Since this is an assignment, you probably won't have any real users. So imagine a group of people that might want to use your code and decide based on that. Or just use the approach you'll have more fun implementing. The important thing imo: Pick one of the two approaches. Don't mix them, because such an API is inconsistent. That increases confusion, decreases ease of use, and makes errors more likely.
Btw:
I think a smart pointer would be overkill here.
Using smart pointers is not a question of overkill. It’s a question of ownership. If you have an owning pointer never use a raw pointer for it. … Unless a legacy API forces you to. Even then it’s a great idea to mark it as owning with a transparent wrapper like gsl::owner<T>.
The question that keeps haunting me is: should it hold a copy of the vector, or hold a raw pointer/reference to the vector outside the object?
I think you should keep a copy of the vector here to keep things generic. You don't want your Collinear class to depend on one particular vector<Point>; instead, when creating an instance of Collinear you just tell it which vector<Point> it has to work on. Then, if you change this vector and want Collinear to work on the new data set (which you might not want), it is your responsibility to tell Collinear to work on the new data set. If you want Collinear to be updated automatically when you update the vector<Point>, you can do so, but then you have to answer questions like what happens to the state of Collinear (which depends on the vector<Point>) when the data set changes.

How to manage millions of game objects with a Slot Map / Object Pool pattern in C++?

I'm developing a game server for a video game called Tibia.
Basically, there can be up to millions of objects, of which there can be up to thousands of deletes and re-creations as players interact with the game world.
The thing is, the original creators used a Slot Map / Object Pool in which pointers are re-used when an object is removed. This is a huge performance boost, since there's no need to reallocate memory unless it's actually needed.
And of course, I'm trying to accomplish that myself, but I've come into one huge problem with my Slot Map:
Here's a brief explanation of how the Slot Map works, according to a source I found online:
The Object class is the base class for every game object; my Slot Map / Object Pool uses this Object class to store every allocated object.
Example:
struct TObjectBlock
{
Object Object[36768];
};
The way the slot map works is that the server first allocates, say, 36768 objects in a list of TObjectBlocks and gives each Object a unique ObjectID, which can be re-used from a free-object list when the server needs to create a new object.
Example:
Object 1 (ID: 555) is deleted, its ID 555 is put on a free object ID list; an Item creation is requested, ID 555 is reused since it's on the free object list, and there is no need to allocate another TObjectBlock in the array for further objects.
My problem: How can I make "Player", "Creature", "Item" and "Tile" work with this Slot Map? I can't seem to come up with a solution to this logic problem.
I am using a virtual class to manage all objects:
struct Object
{
uint32_t ObjectID;
int32_t posx;
int32_t posy;
int32_t posz;
};
Then, I'd create the objects themselves:
struct Creature : Object
{
char Name[31];
};
struct Player : Creature
{
};
struct Item : Object
{
uint16_t Attack;
};
struct Tile : Object
{
};
But now if I was to make use of the slot map, I'd have to do something like this:
Object allocatedObject;
allocatedObject.ObjectID = CreateObject(); // Get a free object ID to use
if (allocatedObject.ObjectID != INVALIDOBJECT.ObjectID)
{
Creature* monster = new Creature();
// This doesn't make much sense, since I'd have this creature pointer floating around!
monster->ObjectID = allocatedObject.ObjectID;
}
It doesn't make much sense to allocate a whole new object through a pointer and then just assign it the already-allocated unique ID.
What are my options with this logic?
I believe you have a lot of tangled concepts here, and you need to detangle them to make this work.
First, you are actually defeating the primary purpose of this model. What you showed smells badly of cargo cult programming. You should not be newing objects, at least not without overloading operator new, if you are serious about this. You should allocate a single large block of memory for a given object type and draw from that on "allocation" - be it from an overloaded new or creation via a memory manager class. That means you need separate blocks of memory for each object type, not a single "objects" block.
The whole idea is that if you want to avoid allocation-deallocation of actual memory, you need to reuse the memory. To construct an object, you need enough memory to fit it, and your types are not the same length. Only Tile in your example is the same size as Object, so only that could share the same memory (but it shouldn't). None of the other types can be placed in the objects memory because they are longer. You need separate pools for each type.
Second, there should be no bearing of the object ID on how things are stored. There cannot be, once you take the first point into consideration, if the IDs are shared and the memory is not. But it must be pointed out explicitly - the position in a memory block is largely arbitrary and the IDs are not.
Why? Let's say you take object 40, "delete" it, then create a new object 40. Now let's say some buggy part of the program referenced the original ID 40. It goes looking for the original 40, which should error, but instead finds the new 40. You just created an entirely untrackable error. While this can happen with pointers, it is far more likely to happen with IDs, because few systems impose checks on ID usage. A main reason for indirecting access with IDs is to make access safer by making it easy to catch bad usage, so by making IDs reusable, you make them just as unsafe as storing pointers.
The actual model for handling this should look like how the operating system does similar operations (see below the divide for more on that...). That is to say, follow a model like this:
Create some sort of array (like a vector) of the type you want to store - the actual type, not pointers to it. Not Object, which is a generic base, but something like Player.
Size that to the size you expect to need.
Create a stack of size_t (for indexes) and push into it every index in the array. If you created 10 objects, you push 0 1 2 3 4 5 6 7 8 9.
Every time you need an object, pop an index from the stack and use the memory in that cell of the array.
If you run out of indexes, increase the size of the vector and push the newly created indexes.
When you use objects, indirect via the index that was popped.
Essentially, you need a class to manage the memory.
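A minimal sketch of such a manager, following the steps above (illustrative only; a real version would also track which slots are alive and run constructors/destructors explicitly):
#include <cstddef>
#include <stack>
#include <vector>

template <typename T>           // one pool per concrete type: Pool<Player>, Pool<Item>, ...
class Pool
{
public:
    explicit Pool(std::size_t n) : m_storage(n)
    {
        for (std::size_t i = 0; i < n; ++i) m_free.push(i);
    }
    std::size_t acquire()                        // pop a free index
    {
        if (m_free.empty())                      // out of indexes: grow and push the new ones
        {
            std::size_t old = m_storage.size();
            m_storage.resize(old == 0 ? 1 : old * 2);
            for (std::size_t i = old; i < m_storage.size(); ++i) m_free.push(i);
        }
        std::size_t idx = m_free.top();
        m_free.pop();
        return idx;
    }
    void release(std::size_t idx) { m_free.push(idx); }       // slot becomes reusable
    T& operator[](std::size_t idx) { return m_storage[idx]; } // always indirect via the index
private:
    std::vector<T> m_storage;                    // the actual objects, not pointers
    std::stack<std::size_t> m_free;              // free indexes
};
Usage would then be along the lines of Pool<Player> players(36768); std::size_t id = players.acquire(); players[id].posx = 0; - always going through the index rather than holding a raw pointer.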
An alternative model would be to directly push pointers into a stack with matching pointer type. There are benefits to that, but it is also harder to debug. The primary benefit to that system is that it can easily be integrated into existing systems; however, most compilers do similar already...
That said, I suggest against this. It seems like a good idea on paper, and on very limited systems it is, but modern operating systems are not "limited systems" by that definition. Virtual memory already resolves the biggest reason to do this, memory fragmentation (which you did not mention). Many compiler allocators will attempt to more or less do what you are trying to do here in the standard library containers by drawing from memory pools, and those are far more manageable to use.
I once implemented a system just like this, but for many good reasons have ditched it in favor of a collection of unordered maps of pointers. I have plans to replace allocators if I discover performance or memory problems associated with this model. This lets me offset the concern of managing memory until testing/optimization, and doesn't require quirky system design at every level to handle abstraction.
When I say "quirky", believe me when I say that there are many more annoyances with the indirection-pool-stack design than I have listed.

shared_ptr in vector trouble - iterating and losing scope - getting corrupted data

I'm not new to C++ but I do mostly work in C# and other managed languages usually so I'm not that well versed in shared pointers etc.
I basically have a 3-dimensional map of shared_ptrs to objects of a custom class (for 3d purposes).
These shared_ptrs live inside the map and are referenced all over the project, so far so good. For a specific piece of functionality I store some of these shared_ptrs in a vector to be iterated over later in the code but this is where things seem to break down.
Say I have 100 of these objects with their pointers stored in the 3d map, and 3 of them are added to the vector because they have special properties for functionality. When I'm traversing the vector I call functions on these 3 objects. This all works fine until the for loop (for the iteration) hits its second run on element [1] of the vector; at this point element [0] is full of corrupted data (but element [1] needs to access element [0] to perform one of the functions), and this is where I get an error.
As I said, I'm not that well versed in shared_ptrs but I thought that the data wouldn't be corrupted because the actual object is created in the creation of the 3-dimensional map - also, the vector hasn't gone out of scope yet because I am only in the 2nd run through the for loop (with 3 objects and so 3 iterations).
I know I must be doing something wrong but I have no idea how to debug this further - basically when I step through the code and it comes to the 2nd run of the for loop, if I look at element [0]'s shared_ptr in the vector (using VS) it is all corrupted.
Here is the loop of the code so you can see. The vector is created in the constructor of the class this code is in and the map (where the shared_ptrs and objects are created) is in the main class of the application. Also, getAdjacent takes objA and objB as pointers and so "fills" objB with the data of the adjacent object to objA:
for(vector<shared_ptr<ObjectClass>>::iterator iterator = objects.begin(); iterator != objects.end(); iterator++)
{
shared_ptr<ObjectClass> objA = (shared_ptr<ObjectClass>) iterator->get();
shared_ptr<ObjectClass> objB;
m_3DMap->getAdjacent(objA, objB);
objA->move(objB);
}
Could it be something to do with the cast I perform on iterator->get()? I couldn't see any other way to do it, because if I don't have that cast there VS says that it can't convert from ObjectClass* to shared_ptr, which is confusing to me too because I thought I had a vector of shared_ptrs?..
Thanks for your time and it would be great if anyone could help.
For starters, you should call get() on a shared_ptr very rarely. And you MUST NOT, ever, use that raw pointer to initialize another shared_ptr.
The raw pointer you construct a shared_ptr from is handed over for ownership, and ownership must be exclusive. The way you use it, two shared_ptrs end up owning the same object, resulting in a premature delete and later a double delete at some point, both putting you in UB land.
This is somewhat better than your original, but without seeing code chances are good that getAdjacent is also broken in some way.
for(auto i = objects.begin(); i != objects.end(); ++i)
{
const auto& objA = *i;
shared_ptr<ObjectClass> objB;
m_3DMap->getAdjacent(objA, objB);
objA->move(objB);
}
The signature of getAdjacent should be something like
void getAdjacent(shared_ptr<ObjectClass> const& objA, shared_ptr<ObjectClass>& objB);
to have a fighting chance.
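To make the ownership problem described above concrete (a minimal sketch, not the actual types from the question):
std::shared_ptr<ObjectClass> a = std::make_shared<ObjectClass>();
std::shared_ptr<ObjectClass> b(a.get()); // WRONG: b starts a second, independent ref count
// When a and b both go out of scope, the ObjectClass is deleted twice -> undefined behaviour.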
Okay I found the problem I had.
It was partly to do with what Balog Pal said about putting the shared_ptrs in a vector even though they live in a map, so I changed this so that the vector held pointers to the shared_ptrs. Although this may still be bad practice, it got around the issue - especially as the vector was local and lost scope after the function exited.
The original vector approach was implemented because I was getting a bad dereference on my map iterator somewhere in the function, and so I decided to keep all map manipulation outside the iteration (hence storing the 'special' objects in the vector to be iterated over later).
I have now changed the implementation to get rid of this inefficiency, as my manipulation functions now return a new map iterator, thus removing the problem.
It would have been nice if the error message had been somewhat more helpful than it was, because it took me some time to realize the iterator became invalidated because I had an insert() buried in one of the functions that was called on the 'special' objects.
Although this answer doesn't really answer the exact question I asked, it answers the problem as a whole and why I designed the function in the way I did in the question.
Thanks for the help Balog and itwasntpete for the answers and comments.

C++: What are scenarios where using pointers is a "Good Idea"(TM)? [duplicate]

Possible Duplicate:
Common Uses For Pointers?
I am still learning the basics of C++ but I already know enough to do useful little programs.
I understand the concept of pointers and the examples I see in tutorials make sense to me. However, on the practical level, and being a (former) PHP developer, I am not yet confident to actually use them in my programs.
In fact, so far I have not felt the need to use any pointer. I have my classes and functions and I seem to be doing perfectly fine without using any pointer (let alone pointers to pointers). And I can't help feeling a bit proud of my little programs.
Still, I am aware that I am missing out on one of C++'s most important features, a double-edged one: pointers and memory management can create havoc, seemingly random crashes, hard-to-find bugs and security holes... but at the same time, properly used, they must allow for clever and efficient programming.
So: do tell me what I am missing by not using pointers.
What are good scenarios where using pointers is a must?
What do they allow you to do that you couldn't do otherwise?
In which ways do they make your programs more efficient?
And what about pointers to pointers???
[Edit: All the various answers are useful. One problem at SO is that we cannot "accept" more than one answer. I often wish I could. Actually, it's all the answers combined that help to understand better the whole picture. Thanks.]
I use pointers when I want to give a class access to an object, without giving it ownership of that object. Even then, I can use a reference, unless I need to be able to change which object I am accessing and/or I need the option of no object, in which case the pointer would be NULL.
This question has been asked on SO before. My answer from there:
I use pointers about once every six lines in the C++ code that I write. Off the top of my head, these are the most common uses:
When I need to dynamically create an object whose lifetime exceeds the scope in which it was created.
When I need to allocate an object whose size is unknown at compile time.
When I need to transfer ownership of an object from one thing to another without actually copying it (like in a linked list/heap/whatever of really big, expensive structs)
When I need to refer to the same object from two different places.
When I need to slice an array without copying it.
When I need to use compiler intrinsics to generate CPU-specific instructions, or work around situations where the compiler emits suboptimal or naive code.
When I need to write directly to a specific region of memory (because it has memory-mapped IO).
Pointers are commonly used in C++. Becoming comfortable with them will help you understand a broader range of code. That said, if you can avoid them, that is great; however, in time, as your programs become more complex, you will likely need them, even if only to interface with other libraries.
Primarily pointers are used to refer to dynamically allocated memory (returned by new).
They allow functions to take arguments that cannot be copied onto the stack, either because they are too big or because they cannot be copied at all, such as an object returned by a system call. (I think stack alignment can also be an issue, but I'm too hazy on that to be confident.)
In embedded programing they are used to refer to things like hardware registers, which require that the code write to a very specific address in memory.
Pointers are also used to access objects through their base class interfaces. That is, if I have a class B that is derived from class A (class B : public A {}), then an instance of B can be accessed as if it were a class A by assigning its address to a pointer to class A, i.e.: A *a = &b_obj;
It is a C idiom to use pointers as iterators on arrays. This may still be common in older C++ code, but is probably considered a poor cousin to the STL iterator objects.
If you need to interface with C code, you will invariably need to handle pointers, which are used to refer to dynamically allocated objects, as there are no references. C strings are just pointers to an array of characters terminated by the nul '\0' character.
Once you feel comfortable with pointers, pointers to pointers won't seem so awful. The most obvious example is the argument list to main(). This is typically declared as char *argv[], but I have seen it declared (legally I believe) as char **argv.
The declaration is C style, but it says that I have an array of pointers to char (equivalently, a pointer to pointers to char). This is interpreted as an arbitrarily sized array (the size is carried by argc) of C style strings (character arrays terminated by the nul '\0' character).
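A minimal example of walking that structure:
#include <iostream>

int main(int argc, char **argv)
{
    // argv points to the first element of an array of char*, one per argument;
    // argc carries the size, and each argv[i] is a C style string.
    for (int i = 0; i < argc; ++i)
        std::cout << argv[i] << '\n';
    return 0;
}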
If you haven't felt a need for pointers, I wouldn't spend a lot of time worrying about them until a need arises.
That said, one of the primary ways pointers can contribute to more efficient programming is by avoiding copies of actual data. For example, let's assume you were writing a network stack. You receive an Ethernet packet to be processed. You successively pass that data up the stack from the "raw" Ethernet driver to the IP driver to the TCP driver to, say, the HTTP driver to something that processes the HTML it contains.
If you're making a new copy of the contents for each of those, you end up making at least four copies of the data before you actually get around to rendering it at all.
Using pointers can avoid a lot of that -- instead of copying the data itself, you just pass around a pointer to the data. Each successive layer of the network stack looks at its own header, and passes a pointer to what it considers the "payload" up to the next higher layer in the stack. That next layer looks at its own header, modifies the pointer to show what it considers the payload, and passes it on up the stack. Instead of four copies of the data, all four layers work with one copy of the real data.
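A toy sketch of that idea (the header sizes and layer functions below are made up for illustration, not a real network stack):
#include <cstddef>
#include <cstdint>

const std::size_t ETH_HEADER = 14;   // illustrative sizes only
const std::size_t IP_HEADER  = 20;
const std::size_t TCP_HEADER = 20;

void handle_http(const std::uint8_t* payload, std::size_t len) { /* parse HTML here */ }

void handle_tcp(const std::uint8_t* data, std::size_t len)
{
    handle_http(data + TCP_HEADER, len - TCP_HEADER);  // advance past our own header
}
void handle_ip(const std::uint8_t* data, std::size_t len)
{
    handle_tcp(data + IP_HEADER, len - IP_HEADER);
}
void handle_ethernet(const std::uint8_t* frame, std::size_t len)
{
    // one copy of the packet bytes, several views of it - nothing is duplicated
    handle_ip(frame + ETH_HEADER, len - ETH_HEADER);
}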
A big use for pointers is dynamic sizing of arrays. When you don't know the size of the array at compile time, you will need to allocate it at run-time.
int *array = new int[dynamicSize];
If your solution to this problem is to use std::vector from the STL, note that it uses dynamic memory allocation behind the scenes.
There are several scenarios where pointers are required:
If you are using abstract base classes with virtual methods, you can hold a std::vector of pointers to the base class, loop through all these objects, and call a virtual method. This REQUIRES pointers (see the sketch after this list).
You can pass a pointer to a buffer to a method reading from a file etc.
You need a lot of memory allocated on the heap.
It's a good thing to care about memory problems right from the start. So if you start using pointers, you might as well take a look at smart pointers, like boost's shared_ptr for example.
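A small sketch of the first scenario in the list above (Shape/Circle/Square are made-up illustration types):
#include <cstddef>
#include <iostream>
#include <memory>
#include <vector>

struct Shape
{
    virtual ~Shape() {}
    virtual double area() const = 0;
};
struct Circle : Shape
{
    double r;
    explicit Circle(double r) : r(r) {}
    double area() const { return 3.14159 * r * r; }
};
struct Square : Shape
{
    double s;
    explicit Square(double s) : s(s) {}
    double area() const { return s * s; }
};

int main()
{
    std::vector<std::shared_ptr<Shape>> shapes;      // pointers to the abstract base
    shapes.push_back(std::shared_ptr<Shape>(new Circle(1.0)));
    shapes.push_back(std::shared_ptr<Shape>(new Square(2.0)));
    for (std::size_t i = 0; i < shapes.size(); ++i)
        std::cout << shapes[i]->area() << '\n';      // virtual call through the pointer
}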
What are good scenarios where using pointers is a must?
Interviews. Implement strcpy.
What do they allow you to do that you couldn't do otherwise?
Use of inheritance hierarchy. Data structures like Binary trees.
In which way to they make your programs more efficient?
They give more control to the programmer, for creating and deleting resources at run time.
And what about pointers to pointers???
A frequently asked interview question: how will you create a two-dimensional array on the heap?
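One common answer to that question, sketched out:
#include <cstddef>

// A two-dimensional rows x cols array on the heap via a pointer to pointers.
int** make_grid(std::size_t rows, std::size_t cols)
{
    int** grid = new int*[rows];       // array of row pointers
    for (std::size_t r = 0; r < rows; ++r)
        grid[r] = new int[cols]();     // each row allocated separately, zero-initialized
    return grid;
}

void free_grid(int** grid, std::size_t rows)
{
    for (std::size_t r = 0; r < rows; ++r)
        delete[] grid[r];
    delete[] grid;
}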
A pointer has a special value, NULL, that references don't have. I use pointers wherever NULL is a valid and useful value.
I just want to say that I rarely use pointers. I use references and STL objects (deque, list, map, etc.).
A good use is when you need to return an object that the calling function should free, or when you don't want to return by value.
List<char*>* fileToList(char* filename);   // don't want to pass the list by value
ClassName* DataToMyClass(DbConnectionOrSomeType& data);
// alternatively, the version below doesn't require pointers
void DataToMyClass(DbConnectionOrSomeType& data, ClassName& myClass);
That's pretty much the only situation where I use them, but I am not thinking that hard. Also, if I want a function to modify a variable and can't use the return value (say I need more than one):
bool SetToFiveIfPositive(int** v);
You can use them for linked lists, trees, etc.
They're very important data structures.
In general, pointers are useful because they can hold the address of a chunk of memory. They are especially useful in some low-level drivers where they are efficiently used to operate on a piece of memory byte by byte. They are the most powerful invention that C++ inherits from C.
As for pointers to pointers, here is a "hello world" example showing you how to use them.
#include <iostream>
int main()
{
int i = 1;
int j = 2;
int *pInt = &i; // "pInt" points to "i"
std::cout<<*pInt<<std::endl; // prints: 1
*pInt = 6; // modify i, i = 6
std::cout<<i<<std::endl; // prints: 6
int **ppInt = &pInt; // "ppInt" points to "pInt"
std::cout<<**ppInt<<std::endl; // prints: 6
**ppInt = 8; // modify i, i = 8
std::cout<<i<<std::endl; // prints: 8
*ppInt = &j; // now pInt points to j
*pInt = 10; // modify j, j = 10
std::cout<<j<<std::endl; // prints: 10
}
As we see, "pInt" is a pointer to integer which points to "i" at the beginning. With it, you can modify "i". "ppInt" is a pointer to pointer which points to "pInt". With it, you can modify "pInt" which happens to be an address. As a result, "*ppInt = &j" makes "pInt" points to "j" now. So we have all the results above.

C++: Updating pointers in deep copy (efficiently)

My question is best illustrated with a code sample, so let's just start off with that:
class Game
{
// All this vector does is establish ownership over the Card objects
// It is initialized with data when Game is created and then is never
// changed.
vector<shared_ptr<Card> > m_cards;
// And then we have a bunch of pointers to the Cards.
// All these pointers point to Cards from m_cards.
// These could have been weak_ptrs, but at the moment, they aren't
vector<Card*> m_ptrs;
// Note: In my application, m_ptrs isn't there, instead there are
// pointers all over the place (in objects that are stored in member
// variables of Game.
// Also, in my application, each Card in m_cards will have a pointer
// in m_ptrs (or as I said, really just somewhere), while sometimes
// there is more than one pointer to a Card.
};
Now what I want to do is to make a deep copy of this Game class. I make a new vector with new shared_ptrs in it, which point to new Card objects which are copies of the original Card objects. That part is easy.
Then the trouble starts: the pointers in m_ptrs should be updated to point to the corresponding cards in the new m_cards, which is no simple task.
The only way I could think of to do this is to create a map and fill it during the copying of m_cards (with map[oldPtr] = newPtr) and then use that to update m_ptrs. However, this is still O(m * log(n)) (m = m_ptrs.size(); n = m_cards.size()). As this is going to be a pretty regular operation*, I would like to do it efficiently, and I have the feeling that it should be possible in O(m) using custom pointers. However, I can't seem to find an efficient way of doing this. Does anybody know of one?
*it's used to create a testbed for the AI, letting it "try out" different moves
Edit: I would like to add a bit on accepting an answer, as I haven't yet. I am waiting until I get back to this project (I got on a side track as I had worked too much on this project - if you do it for fun it's got to stay fun), so it may be a while longer before I accept an answer. Nevertheless, I will accept an answer some time, so don't worry :P
Edit nr 2: I still haven't gotten back to this project. Right now, I am thinking about just taking the O(m * log(n)) way and not complaining, then seeing later if it needs to be faster. However, as I have recently taken some time to learn my patterns, I am also thinking that I really need to refactor this project some time. Oh, and that I might just spend some time working on this problem with all the new knowledge I have under my belt. Since there isn't an answer that says "just stick with the hashmap and see later if it really needs to be faster" (and I would actually be pretty disappointed if there was, as it's not an answer to my question), I am postponing the picking of an answer yet a bit more till I do get back to this project.
Edit nr 3: I still didn't get back to this project. More precisely, it has been shelved indefinitely. I am pretty sure I would just not worry too much about the O(m * log(n)) right now, and then perhaps look at it later if it turned out to be a problem. However, that would just not have been a good answer to my question, as I explicitly asked for better performance. Not wanting to leave the answers unaccepted any longer, I chose the most helpful answer and accepted it.
Store the pointers as indexes.
As you say, they all point into m_cards, which is a vector that can be indexed.
Either you do that only for storing and convert them back to pointers when loading.
Or you may think of using indices generally instead of pointers.
What about keeping the index of the card element instead of a pointer:
vector<int> m_indexes;
...
Card* ptr = m_cards[m_indexes[0]].get(); // m_cards holds shared_ptrs, so go through get()
A vector of indexes can be copied without changes.
I recently encountered a very similar problem: cloning a class's internal structure implemented with pointers and a std::vector used as object storage.
First of all (unrelated to the question, though), I'd suggest sticking either with smart pointers or with plain structures. In your case that means it makes much more sense to use vector<weak_ptr<Card> > m_ptrs instead of raw pointers.
About the question itself - one more possible workaround is using pointer differences in the copy constructor. I will demonstrate it for a vector of objects, but working with shared pointers follows the same principle; the only difference is in the copying of m_cards (you should not simply use assignment if you want object clones, but copy m_cards element by element).
It is very important that this method works only for containers where the elements are guaranteed to be stored contiguously (vector, array).
Another very important point is that the m_ptrs elements should represent only internal Card structure, i.e. they must point only to the internal m_cards elements.
// assume we store objects directly, not in shared pointers
// the only difference for shared pointers will be in
// m_cards assignment
// and using m_cards[...].get() instead of &m_cards[...]
vector<Card> m_cards;
vector<Card*> m_ptrs;
In that case your array of pointers can easily be computed using pointer arithmetic, taking linear time. Your copy constructor will then look like this:
Game::Game(const Game &rhs) {
if (rhs.m_cards.empty())
return;
m_cards = rhs.m_cards;
// if we have vector of shared pointers
// and we need vector of pointers to objects clones
// something like this should be done
// for (auto p: rhs.m_cards) {
// // we must be certain here that dereferencing is safe,
// // i. e. object must exist. If not, additional check is required.
// // copy constructor will be called here:
// m_cards.push_back(std::make_shared<Card>(*p));
// }
const Card *first = &rhs.m_cards[0];
for (auto p: rhs.m_ptrs) {
m_ptrs.push_back(&m_cards[p - first]);
}
}
Basically, with this deep-copy method you are still working with indexes, but you preserve the convenience of working with pointers in the other class methods without storing the indexes separately.
Anyway, to use that kind of structure you should know exactly what you are doing with the class members and why; it requires much more manual control (for example, adding/removing elements to/from m_cards should at least be done consciously, otherwise m_ptrs can easily become broken even without copying the object).