Background/Example:
I currently have a class like the following:
class Element {
    Large l1;
    OtherLarge l2;
    Small s1;
    VerySmall s2;
};
where Large and OtherLarge are relatively big (~80 bytes), whereas Small and VerySmall are fairly small (~4 to 16 bytes).
On these elements, I operate in two ways:
sorting them in various ways. During this, only members s1 and s2 are accessed/needed.
combining the large members in various way (e.g. matrix-matrix multiplication).
The second class of operations is already fairly fast and can be parallelised easily, hence I’d like to speed up the first class of operations more. Comparing with another class Element2 where I replaced the two big data members by 8-byte integers doing nothing tells me that if I can somehow replace the direct data members l1 and l2 by pointers of one sort or another to dynamically-allocated elements elsewhere, I’ll already get a big win.
For reference, all member types have both copy and move constructors and can be both copied and moved, but moving them is much much cheaper. Large and OtherLarge also allocate a lot of memory by themselves, so allocating a bit more isn’t necessarily horrible.
Concrete question
Is it possible, and if so, what is the best way, to replace a direct member object of a class with a pointer to a dynamically-allocated object elsewhere; preserving the behaviour of a direct member as closely as possible w.r.t construction, destruction, member variable access etc? If I use a std::unique_ptr<Large> naively, I assume I’ll have to dereference it half the time/take care of copying specially? Ideally I’d like the new member object to behave just as if the old, big member object was still there.
A plain unique_ptr will not fully solve your issue: std::sort itself only needs moves, but every place that copies an Element would then need a hand-written copy. I am fairly certain that the flyweight pattern can solve your issues. Here is a simple implementation:
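#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Large payloads live in their own vector and are never touched by sorting.
class Element_data {
public:
    Large l1;
    OtherLarge l2;
};
std::vector<Element_data> data;

class Element {
public:
    Small s1;
    VerySmall s2;
    int data_ind;  // index into data rather than a pointer
    Large &GetLarge1() {
        assert(data_ind >= 0 && static_cast<std::size_t>(data_ind) < data.size());
        return data[data_ind].l1;
    }
};
std::vector<Element> elements;
...
std::sort(elements.begin(), elements.end(), &mysortfn);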
Why not use pointers? They may get invalidated if you add new members to the data vector. Also this approach allows you to keep your data together so it might get loaded into cache easier.
Additional points:
If you are deleting an element for good, you should erase the data as well.
Adding a new member is quite simple.
While sorting or during some other operations, it is safe to have more than one Element pointing to the same data.
Edit: In case it is not clear, and to make sure you won't run into problems: the destructor of Element should not destroy the data. You can provide a custom deleter for this. The best approach would be to develop a container that handles this and erases both the element and its data in its erase function.
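A minimal sketch of the index-based layout described above, under stated assumptions: Large is modelled as an 80-byte array, the keys are plain ints, and ElementStore, add, sort_by_keys and large1 are hypothetical names chosen for illustration. Sorting moves only the small records; the payload vector is left untouched.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the real Large/OtherLarge types (assumption: ~80 bytes each).
using Large = std::array<double, 10>;

struct ElementData {
    Large l1;
    Large l2;
};

struct Element {
    int s1;                // primary sort key (hypothetical)
    int s2;                // secondary sort key (hypothetical)
    std::size_t data_ind;  // index into the payload vector, not a pointer
};

struct ElementStore {
    std::vector<ElementData> data;  // large payloads, untouched by sorting
    std::vector<Element> elements;  // small records that get sorted

    void add(int s1, int s2, const Large& l1, const Large& l2) {
        data.push_back(ElementData{l1, l2});
        elements.push_back(Element{s1, s2, data.size() - 1});
    }

    // Only the small records are moved; each keeps the index of its payload.
    void sort_by_keys() {
        std::sort(elements.begin(), elements.end(),
                  [](const Element& a, const Element& b) {
                      return a.s1 != b.s1 ? a.s1 < b.s1 : a.s2 < b.s2;
                  });
    }

    Large& large1(const Element& e) {
        assert(e.data_ind < data.size());
        return data[e.data_ind].l1;
    }
};
```

Because the records hold indices rather than pointers, growing the payload vector does not invalidate them; erasing a payload from the middle would, so removal needs its own bookkeeping.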
Let's assume that I have a class
class Foo
{
public:
    Foo (const std::string&);
    virtual ~Foo()=default;
private:
    //some private properties
};
And I want to create many instances of this class. Since I aim for good performance, I want to allocate the memory at once for all of them (at this point, I know the exact number but only at runtime). However, each object shall be constructed with an individual constructor parameter from a vector of parameters
std::vector<std::string> parameters;
Question: How can this be achieved?
My first try was to start with a std::vector<Foo> and then reserve(parameters.size()) and use emplace_back(...) in a loop. However I cannot use this approach because I use pointers to the individual objects and want to be sure that they are not moved to a different location in memory by the internal methods of std::vector. To avoid this I tried to delete the copy constructor of Foo to be sure at compile time that no methods can be called that might copy the objects to a different location but then I cannot use emplace_back(...) anymore. The reason is that in this method, the vector might want to grow and copy all the elements to the new location, it does not know that I reserved enough space.
I see three possibilities:
Use vector with reserve + emplace_back. You have the guarantee that your elements don't get moved as long as you don't exceed the capacity.
Use malloc + placement new. This allows you to allocate raw memory and then construct each element one by one e.g. in a loop.
If you already have a range of parameters from which to construct your objects, as in the example, you can probably (depending on your implementation of std::vector) use std::vector's iterator-based constructor like this:
std::vector<Foo> v(parameters.begin(),parameters.end());
The first solution has the advantage of being much simpler, and it has all the other goodies of a vector, like taking care of destruction, keeping the size around, etc.
The second solution might be faster, because you don't need to do the housekeeping of vector's emplace_back, and it works even with a deleted move / copy constructor if that is important to you, but it leaves you with dozens of possibilities for errors.
The third solution - if applicable - is imho the best. It also works with deleted copy / move constructors, should not have any performance overhead and it gives you all the advantages of using a standard container.
It does, however, rely on the constructor first determining the size of the range (e.g. via std::distance). The standard guarantees a single allocation with no reallocation when the iterators are forward iterators or better; for pure input iterators the vector has to grow incrementally, and the element type must then be movable. Also, in some cases, providing appropriate iterators requires writing some boilerplate code.
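The third option can be sketched as follows, assuming the parameters come from a std::vector (so the iterators are random access). Foo is the class from the question, made non-copyable here; the name() accessor and the make_foos helper are hypothetical additions for demonstration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Foo as in the question, made non-copyable (and therefore non-movable).
// The name() accessor is a hypothetical addition for demonstration.
class Foo {
public:
    Foo(const std::string& name) : name_(name) {}
    Foo(const Foo&) = delete;
    Foo& operator=(const Foo&) = delete;
    virtual ~Foo() = default;

    const std::string& name() const { return name_; }

private:
    std::string name_;
};

// The range constructor sizes the storage once and then constructs each
// Foo in place from the corresponding string; no Foo is copied or moved.
std::vector<Foo> make_foos(const std::vector<std::string>& parameters) {
    return std::vector<Foo>(parameters.begin(), parameters.end());
}
```

Pointers taken to the elements afterwards stay valid as long as the vector is not resized; with a non-movable Foo, operations such as push_back would not compile in the first place.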
How would one go about creating a vector that includes both the base class as well as any derived classes?
For example, in a chess engine, I currently have a Move class which stores a particular move and a few functions to help it. In order to save memory, as millions of these objects are going to be created, I also have a derived class CaptureMove that extends the Move class storing a bit more information about what and where the piece was captured.
From what I can gather, pointers to Move objects should work, but I'm not quite sure on how to go about it.
The question is quite broad. Here are some ideas:
Vectors of base pointers:
This works extremely well if your class is polymorphic (i.e. the relevant functions of the base class are virtual).
vector<Move*> mp;
mp.push_back (new Move); // attention: you have to delete it, or memory will leak
mp.push_back (new CaptureMove);
It is the simplest way to proceed. However, you have to make sure that when you add an object, it is allocated properly (e.g. created with new), and that once you no longer need it, you delete it. This can be very cumbersome, especially if the vector was copied and some of its pointers are still in use.
This approach can be practical for example if you create and delete the objects in a centralised manner, so that the vector only uses pointers which are properly managed somewhere else.
Vector of shared base pointers:
vector<shared_ptr<Move>> m;
m.push_back(make_shared<Move>());
m.push_back(make_shared<CaptureMove>());
m.push_back(make_shared<Move>());
It extends the pointer solution, using smart pointers to take care of the release of unused objects.
Honestly, it adds a little overhead, but it's really worth it in order to have reliable code. This is the approach I would personally take if I had to do it.
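A minimal sketch of the shared-pointer variant, assuming a hypothetical Move hierarchy: the from, to and captured_piece fields, describe() and sample_moves() are illustrative names, not part of the question. The base class gets a virtual destructor so that derived objects are destroyed correctly through a base pointer.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical minimal Move hierarchy; the fields are illustrative only.
struct Move {
    int from;
    int to;
    Move(int f, int t) : from(f), to(t) {}
    virtual ~Move() = default;  // needed to destroy derived objects via Move*
    virtual std::string describe() const { return "move"; }
};

struct CaptureMove : Move {
    char captured_piece;
    CaptureMove(int f, int t, char piece) : Move(f, t), captured_piece(piece) {}
    std::string describe() const override { return "capture"; }
};

// Builds a mixed vector; the objects are released when the last owner goes away.
std::vector<std::shared_ptr<Move>> sample_moves() {
    std::vector<std::shared_ptr<Move>> moves;
    moves.push_back(std::make_shared<Move>(12, 28));
    moves.push_back(std::make_shared<CaptureMove>(28, 35, 'p'));
    return moves;
}
```

If each move has exactly one owner, std::unique_ptr<Move> can be used in the same way without the reference-counting cost.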
Vector of compound objects:
You could also prefer to store the object itself instead of a pointer to the object. While the idea seems simple, it's more difficult to do, because different derived classes can have different sizes. It also has a serious drawback: you need to know all possible base and derived types you may store in the vector, which makes this approach less flexible.
You could certainly manage this with a complex union, but the easiest way would be to use boost::variant.
vector<boost::variant<Move, CaptureMove>> m;
This approach is only worth considering if the number of derived classes is very limited, but you have huge numbers of small objects (so that memory allocation would become a real overhead) of almost the same size.
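The same idea can be sketched with std::variant from C++17, used here in place of boost::variant. The Move and CaptureMove fields, the AnyMove alias and describe() are hypothetical names for illustration; note that the two types no longer need to be related by inheritance.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// Hypothetical value types; no inheritance or virtual functions are needed.
struct Move {
    int from;
    int to;
};

struct CaptureMove {
    int from;
    int to;
    char captured_piece;
};

using AnyMove = std::variant<Move, CaptureMove>;

// std::visit calls the overload matching the alternative currently held.
std::string describe(const AnyMove& m) {
    struct Visitor {
        std::string operator()(const Move&) const { return "move"; }
        std::string operator()(const CaptureMove&) const { return "capture"; }
    };
    return std::visit(Visitor{}, m);
}
```

Each vector element is as large as the biggest alternative plus a discriminator, so the memory saving over pointers depends on how close in size the alternatives are.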
I have been given some code to read which does some geometric operations on meshes.
A mesh data structure, by definition, should contain at least the information
regarding the coordinates of points, edge connectivity and face information.
So, the code given to me has classes to define vertex, edge and face data structure,
named respectively as Vertex, Edge and Face.
However the mesh class looks like this.
class basemesh
{
public:
    /* Methods to operate on the protected data below.*/
protected:
    /*! list of edges */
    std::list<Edge*> m_edges;
    /*! list of vertices */
    std::list<Vertex*> m_verts;
    /*! list of faces */
    std::list<Face*> m_faces;
};
My question: why does the mesh data structure store a list of pointers rather than a
list of the corresponding objects themselves?
E.g. why not directly use std::list<Vertex>?
I have seen this construct used in a couple of other C++ code bases.
Does this have something to do with inheritance of classes? Or is it something to do
with performance with regard to iterating over the list?
This basemesh class is, as the name suggests, a base class from which
other specialized meshes are derived.
There are no performance reasons here. It's simply a case of ownership sharing. Remember this as a rule of thumb: pointers in C++ are used to share/pass ownership of a resource, or to provide polymorphic behaviour through dynamic binding.
People talk about performance because you avoid copying things. Blah, blah, blah.
If you need to copy, you should copy. The only reason it's using pointers is that the author didn't want to copy the things when he/she copies the list of things; in other words, he/she wants to maintain the same things in two locations (lists): ownership sharing, as I said before.
On the other hand, note that the class is called basemesh. So the real point of the pointers here could be to work with polymorphic vertices, edges, etc (Dynamic binding).
NOTE: If performance were the point here, I'm pretty sure the author would be using a compact, contiguous, cache-friendly std::vector instead of std::list. In this case, the most likely reason for the use of pointers is polymorphism, not performance. Anything involving pointers, dereferencing, and traversing linked lists will generally perform worse than compact data, which is exactly what std::vector<Vertex> is, for example. Again, if the use of pointers is not for polymorphism, it is for ownership-related reasons, not performance.
Other note: copying. Yes, you are copying. But note what you are copying and how. Vertices are, except in very rare implementations, pairs of floats/ints. There is no gain at all in copying 32/64 bits of pointer instead of 64 bits of floats.
Also note that, unless you are unlucky, you are copying things stored in the same cache line, or at least already in the cache.
A good rule about optimization nowadays is: try to optimize memory accesses, not CPU cycles. I recommend this thread: What is "cache-friendly" code?, and this for a practical case: Why are elementwise additions much faster in separate loops than in a combined loop?. Finally, this thread contains good notes about optimizing using modern compilers.
My guess is that it's either made for a very unusual specific case or, more likely, written by a programmer who doesn't know how heap allocations or std::list actually work and just blindly uses pointers.
It seems very unlikely a std::list of pointers to single vertices was the best option performance- or designwise.
On a practical level, if a method changes a point, it does not need to reproduce the change in the other data structures; they all point to the same thing.
But in terms of memory management it would be wise to use smart pointers.
At a guess I'd say it's so that these objects can have pointers to each other (e.g. an Edge can have pointers to two Vertices, each of which can have a pointer back to the Edge).
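If all the Vertices lived by value in a container that relocates its elements (a std::vector, for example), then pointers to them would not be reliable. In a std::list, pointers and list::iterators to an element stay valid until that element is erased, so they might work well enough.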
Using pointers is less efficient when retrieving inner data in general because you will have to dereference the value every time you access it.
But at the same time it will be more efficient when passing data around, since you are just passing pointers. I guess the solution chosen is related to the fact that data is shared between multiple objects by composition. E.g. multiple Edge instances could refer to the same Vertex.
Now std::list guarantees that the addresses of contained values remain stable until the element itself is removed, so doing something like
Edge(const Vertex *v1, const Vertex *v2) { .. }
std::list<Vertex>::iterator it = std::next(vertices.begin(), 3);
std::list<Vertex>::iterator it2 = std::next(vertices.begin(), 5);
new Edge(&(*it), &(*it2));
would work, since the addresses won't be invalidated, so there is no real need to store pointers to the objects in the list. Actually, with this solution you don't need to care about memory management of single objects, since you won't need to delete them or wrap them in smart pointers.
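A minimal sketch of this by-value layout, assuming hypothetical Vertex, Edge and Mesh types with illustrative members. It relies on std::next, which returns the advanced iterator, and on the std::list guarantee that inserting or erasing other elements does not move existing ones.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

// Hypothetical minimal geometry types for illustration.
struct Vertex {
    double x;
    double y;
};

struct Edge {
    const Vertex* v1;
    const Vertex* v2;
};

struct Mesh {
    std::list<Vertex> vertices;  // owns the vertices by value
    std::list<Edge> edges;       // edges refer to vertices by address

    // Connects the i-th and j-th vertex (both indices must be in range).
    void connect(std::size_t i, std::size_t j) {
        auto a = std::next(vertices.begin(), static_cast<std::ptrdiff_t>(i));
        auto b = std::next(vertices.begin(), static_cast<std::ptrdiff_t>(j));
        edges.push_back(Edge{&*a, &*b});
    }
};
```

Erasing a vertex that an edge still refers to leaves a dangling pointer, so removal has to update or remove the affected edges first.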
It's using pointers for performance reasons and to reduce the chance of an error.
Imagine the alternative of not using pointers. Every insertion into class basemesh would cause a copy of the object to be created, and every time you access an object, if you aren't careful, you'll get a copy as well.
For example, imagine this statement:
Edge e = m_edges.front();
e.doSomethingThatModifiesState();
In this example, without pointers, you'll have a copy of the object, and any operations you perform on it will not affect the actual edge object stored in m_edges.
With pointers, you don't have this issue:
Edge* e = m_edges.front();
e->doSomethingThatModifiesState();
In this example, no copy of the object is made, and when you do something, you get the intended behavior.
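For comparison, the copy in the first snippet comes from declaring e by value, not from storing the objects by value. A minimal sketch, assuming a hypothetical Edge with a single weight field and two illustrative helper functions:

```cpp
#include <cassert>
#include <vector>

// Hypothetical Edge with one piece of mutable state.
struct Edge {
    int weight = 0;
    void doSomethingThatModifiesState() { ++weight; }
};

// Binding a reference modifies the stored element; no copy is made.
void bump_first_by_reference(std::vector<Edge>& edges) {
    Edge& e = edges.front();
    e.doSomethingThatModifiesState();
}

// Declaring e by value copies the element; only the copy is modified.
void bump_first_by_copy(std::vector<Edge>& edges) {
    Edge e = edges.front();
    e.doSomethingThatModifiesState();
}
```

A container of objects accessed through references gives the same in-place behaviour as the pointer version.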
As many others have said, speed is the most obvious reason. Another reason is to get polymorphic behavior through pointers to the base class.
I am building a C++ class A that needs to contain a bunch of pointers to other objects B.
In order to make the class as general as possible, I am using a std::vector<B*> inside this class. This way any number of different B can be held in A (there are no restrictions on how many there can be).
Now this might be a bit of overkill because most of the time, I will be using objects of type A that only hold either 2 or 4 B*'s in the vector.
Since there is going to be a lot of iterative calculations going on, involving objects of class A, I was wondering if there is a lot of overhead involved in using a vector of B's when there are only two B's needed.
Should I overload the class to use another container when there are fewer than 3 B present?
To make things clearer: A are multipoles and B are magnetic coils that constitute the multipoles.
Premature optimization. Get it working first. If you profile your application and see that you need more efficiency (in memory or performance), then you can change it. Otherwise, it's a potential waste of time.
I would use a vector for now, but give it an alias instead of spelling std::vector out directly where it's used (std::vector is a template, so this needs an alias template rather than a plain typedef):
template <class T>
using vec_type = std::vector<T>;
class A {
    vec_type<B*> whatever;
};
Then, when/if it becomes a problem, you can change that alias to refer to a vector-like class that's optimized for a small number of contained objects (e.g., does something like the small-string optimization that's common with many implementations of std::string).
Another possibility (though I don't like it quite as well) is to continue to use the name "vector" directly, but use a using declaration to specify what vector to use. A using declaration for a namespace member is not allowed at class scope, so it goes in the enclosing namespace (my_project is a placeholder name):
namespace my_project {
using std::vector;
class A {
    vector<B*> whatever;
};
}
In this case, when/if necessary, you put your replacement vector into a namespace, and change the using declaration to point to that instead:
namespace my_project {
using my_optimized_version::vector;
// the rest of the code remains unchanged:
class A {
    vector<B*> whatever;
};
}
As far as how to implement the optimized class, the typical way is something like this:
template <class T>
class pseudo_vector {
    T small_data[5];
    T *data;
    size_t size;
    size_t allocated;
public:
    // ...
};
Then, if you have 5 or fewer items to store, you put them in small_data. When/if your vector contains more items than that fixed limit, you allocate space on the heap, and use data to point to it.
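A minimal sketch of that small-buffer idea, under simplifying assumptions: it is restricted to trivially copyable element types such as B*, copying is disabled for brevity, and small_vector, on_heap and the other member names are hypothetical. It is not a drop-in replacement for std::vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <utility>

// Sketch only: restricted to trivially copyable T (such as B*) so that
// elements can be copied around without tracking object lifetimes.
template <class T, std::size_t N = 5>
class small_vector {
    static_assert(N > 0, "inline capacity must be positive");
    static_assert(std::is_trivially_copyable<T>::value,
                  "this sketch only supports trivially copyable types");

    T small_data_[N] = {};       // inline storage for up to N items
    std::unique_ptr<T[]> heap_;  // used only once the size exceeds N
    std::size_t size_ = 0;
    std::size_t capacity_ = N;

    T* data() { return heap_ ? heap_.get() : small_data_; }
    const T* data() const { return heap_ ? heap_.get() : small_data_; }

    // Doubles the capacity and moves the contents to the heap.
    void grow() {
        const std::size_t new_capacity = capacity_ * 2;
        std::unique_ptr<T[]> bigger(new T[new_capacity]());
        std::copy(data(), data() + size_, bigger.get());
        heap_ = std::move(bigger);
        capacity_ = new_capacity;
    }

public:
    small_vector() = default;
    small_vector(const small_vector&) = delete;             // copying omitted
    small_vector& operator=(const small_vector&) = delete;  // for brevity

    void push_back(const T& value) {
        if (size_ == capacity_) grow();
        data()[size_++] = value;
    }

    T& operator[](std::size_t i) { return data()[i]; }
    const T& operator[](std::size_t i) const { return data()[i]; }
    std::size_t size() const { return size_; }
    bool on_heap() const { return static_cast<bool>(heap_); }
};
```

With 2 or 4 pointers per object, as in the question, such a container never touches the heap; whether that matters in practice is something only profiling can show.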
Depending a bit on what you're trying to optimize, you may want to use an abstract base class, with two descendants, one for small vectors and the other for large vectors, with a pimpl-like class to wrap them and make either one act like something you can use directly.
Yet another possibility that can be useful for some situations is to continue to use std::vector, but provide a custom Allocator object for it to use when obtaining storage space. Googling for "small object allocator" should turn up a number of candidates that have already been written. Depending on the situation, you may want to use one of those directly, or you may want to use them as inspiration to write your own.
If you need an array of B* that will never change its size, you won't need the dynamic shrinking and growing abilities of the std::vector.
So, probably not for reasons of efficiency, but for reasons of intuition, you could consider using a fixed length array:
struct A {
    enum { ndims = 2 };
    B* b[ndims];
};
or std::array (if available):
struct A {
    std::array<B*, 2> b;
};
Vectors are pretty lean as far as overhead goes. I'm sure someone here can give more detailed information about what that really means. But if you've got performance issues, they're not going to come from vector.
In addition I'd definitely avoid the tactic of using different containers depending on how many items there are. That's just begging for a disaster and won't really give you anything in return.
My question is best illustrated with a code sample, so let's just start off with that:
class Game
{
    // All this vector does is establish ownership over the Card objects.
    // It is initialized with data when Game is created and then is never
    // changed.
    vector<shared_ptr<Card> > m_cards;
    // And then we have a bunch of pointers to the Cards.
    // All these pointers point to Cards from m_cards.
    // These could have been weak_ptrs, but at the moment, they aren't.
    vector<Card*> m_ptrs;
    // Note: In my application, m_ptrs isn't there, instead there are
    // pointers all over the place (in objects that are stored in member
    // variables of Game).
    // Also, in my application, each Card in m_cards will have a pointer
    // in m_ptrs (or as I said, really just somewhere), while sometimes
    // there is more than one pointer to a Card.
};
Now what I want to do is to make a deep copy of this Game class. I make a new vector with new shared_ptrs in it, which point to new Card objects which are copies of the original Card objects. That part is easy.
Then the trouble starts, the pointers of m_ptrs should be updated to point to the cards in m_cards, which is no simple task.
The only way I could think of to do this is to create a map and fill it during the copying of m_cards (with map[oldPtr] = newPtr) and then to use that to update m_ptrs. However, this is only O(m * log(n)) (m = m_ptrs.size(); n = m_cards.size()). As this is going to be a pretty regular operation* I would like to do this efficiently, and I have the feeling that it should be possible in O(m) using custom pointers. However, I can't seem to find an efficient way of doing this. Anybody who does?
*it's used to create a testbed for the AI, letting it "try out" different moves
Edit: I would like to add a bit on accepting an answer, as I haven't yet. I am waiting until I get back to this project (I got on a side track as I had worked too much on this project - if you do it for fun it's got to stay fun), so it may be a while longer before I accept an answer. Nevertheless, I will accept an answer some time, so don't worry :P
Edit nr 2: I still haven't gotten back to this project. Right now, I am thinking about just taking the O(m * log(n)) way and not complaining, then seeing later if it needs to be faster. However, as I have recently taken some time to learn my patterns, I am also thinking that I really need to refactor this project some time. Oh, and that I might just spend some time working on this problem with all the new knowledge I have under my belt. Since there isn't an answer that says "just stick with the hashmap and see later if it really needs to be faster" (and I would actually be pretty disappointed if there was, as it's not an answer to my question), I am postponing the picking of an answer yet a bit more till I do get back to this project.
Edit nr 3: I still didn't get back to this project. More precisely, it has been shelved indefinitely. I am pretty sure I would simply not have worried too much about the O(m * log(n)) at that point, and then perhaps looked at it later if it turned out to be a problem. However, that would just not have been a good answer to my question, as I explicitly asked for better performance. Not wanting to leave the answers unaccepted any longer, I chose the most helpful answer and accepted it.
Store the pointers as indexes.
As you say, they all point to m_cards, which is a vector that can be indexed (is that correct English?).
Either you do that only for the copy and convert the indexes back to pointers afterwards,
or you may think of using indices generally instead of pointers.
What about keeping each card's index instead of a pointer:
vector<int> m_indexes;
...
Card* ptr = m_cards[m_indexes[0]].get();
A vector of indexes can be copied without changes.
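A minimal sketch of a deep copy built on indexes, assuming a hypothetical Card with a single value field and an illustrative card_at() accessor. Cloning the cards takes O(n) and copying the indexes takes O(m), with no lookup table needed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical Card with a single field for illustration.
struct Card {
    int value;
};

class Game {
public:
    std::vector<std::shared_ptr<Card>> m_cards;  // owns the cards
    std::vector<std::size_t> m_indexes;          // refers to cards by position

    Game() = default;

    // Deep copy: clone every card, then copy the indexes unchanged.
    Game(const Game& rhs) : m_indexes(rhs.m_indexes) {
        m_cards.reserve(rhs.m_cards.size());
        for (const auto& card : rhs.m_cards) {
            m_cards.push_back(std::make_shared<Card>(*card));
        }
    }
    Game& operator=(const Game&) = delete;  // assignment omitted in this sketch

    // Resolves a slot to the card it refers to in this Game.
    Card& card_at(std::size_t slot) { return *m_cards[m_indexes[slot]]; }
};
```

The indexes stay meaningful only while the order of m_cards is unchanged, which matches the question's statement that the vector is never modified after construction.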
I recently encountered a very similar problem: cloning a class whose internal structure is implemented with pointers, with a std::vector as the object storage.
First of all (unrelated to the question though), I'd suggest sticking either with smart pointers or with plain structures. In your case it means that it makes much more sense to use vector<weak_ptr<Card> > m_ptrs instead of raw pointers.
About the question itself - one more possible workaround is using pointer differences in the copy constructor. I will demonstrate it for a vector of objects, but working with shared pointers will utilize the same principle; the only difference will be in the copying of m_cards (you should not simply use assignment if you want clones of the objects, but copy m_cards element by element).
It is very important that the method works only for containers where the elements are guaranteed to be stored contiguously (vector, array).
Another very important point is that the m_ptrs elements should represent only the internal Card structure, i.e. they must point only to the internal m_cards elements.
// assume we store objects directly, not in shared pointers
// the only difference for shared pointers will be in
// m_cards assignment
// and using m_cards[...].get() instead of &m_cards[...]
vector<Card> m_cards;
vector<Card*> m_ptrs;
In that case your array of pointers can easily be computed using pointer arithmetic, in linear time. Your copy constructor will then look like this:
Game::Game(const Game &rhs) {
    if (rhs.m_cards.empty())
        return;
    m_cards = rhs.m_cards;
    // if we have vector of shared pointers
    // and we need vector of pointers to objects clones
    // something like this should be done
    // for (const auto &p : rhs.m_cards) {
    //     // we must be certain here that dereferencing is safe,
    //     // i. e. object must exist. If not, additional check is required.
    //     // copy constructor will be called here:
    //     m_cards.push_back(std::make_shared<Card>(*p));
    // }
    // rhs is const, so its elements are reachable only through const Card*
    const Card *first = rhs.m_cards.data();
    for (const Card *p : rhs.m_ptrs) {
        // p - first is the position of the pointed-to card in rhs.m_cards
        m_ptrs.push_back(&m_cards[p - first]);
    }
}
Basically, in this deep-copy method you are still working with indexes, but you preserve the convenience of working with pointers in the other class methods, without storing the indexes separately.
Anyway, to use that kind of structure you should know exactly what you are doing with the class members and why, which requires much more manual control (for example, adding/removing elements to/from m_cards must be done carefully; otherwise m_ptrs can easily become broken even without copying the object).
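The precondition that every pointer refers to an element of m_cards can be checked explicitly before the offsets are computed. A minimal sketch, assuming cards stored by value in a std::vector; the Card field and the ptrs_are_internal() helper are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical Card with a single field for illustration.
struct Card {
    int value;
};

class Game {
public:
    std::vector<Card> m_cards;  // contiguous storage, owned by value
    std::vector<Card*> m_ptrs;  // must point only into m_cards

    Game() = default;

    Game(const Game& rhs) : m_cards(rhs.m_cards) {
        assert(rhs.ptrs_are_internal());
        const Card* first = rhs.m_cards.data();
        m_ptrs.reserve(rhs.m_ptrs.size());
        for (const Card* p : rhs.m_ptrs) {
            // Offset of p within rhs.m_cards, reused as an index into the copy.
            const std::ptrdiff_t index = p - first;
            m_ptrs.push_back(&m_cards[static_cast<std::size_t>(index)]);
        }
    }
    Game& operator=(const Game&) = delete;  // assignment omitted in this sketch

    // True if every pointer lies inside the current m_cards storage.
    // std::less gives a well-defined order even for unrelated pointers.
    bool ptrs_are_internal() const {
        const std::less<const Card*> before;
        const Card* first = m_cards.data();
        const Card* last = first + m_cards.size();
        for (const Card* p : m_ptrs) {
            if (before(p, first) || !before(p, last)) return false;
        }
        return true;
    }
};
```

The same check is useful after any operation that may reallocate m_cards, since a push_back that grows the vector leaves all of m_ptrs dangling.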