I have a class myClass. A myClass object has a (human readable) name, and some more information.
class myClass
{
std::string name;
int attribute;
int anotherAttribute;
};
They are stored inside a STL container, a vector, for example.
std::vector<myClass> myList;
When the client wants to access an element, it does this by name. That means, I have to iterate over the whole vector to find the correct object (the container contains about a few hundred objects).
So, I'm thinking of moving to std::map as the container, instead of vector. As far as I understand, a map should be the container of choice when accessing elements by name instead of by index.
However, the name of the object is then stored twice: once as the map key, and once in the object itself. The memory overhead should be no problem, but I wonder if this is good practice. There may be the problem that the names get out of sync (for some mysterious reason). I even thought about dropping the name member of myClass.
To make it short: what container should I use, and why?
You should choose your container based on the way you store and access data in it. So in your use case you should definitely use an unordered_map.
You should keep the std::string name attribute in your class if and only if you use it from inside the class.
So the question is: will you at any point have to get the name from the object, rather than the object from its name?
This occurs when you get the object from the container and store it elsewhere, then use it later on.
Given the nature of name (which is most likely not going to change), you don't really have to worry about the name attribute not being "in sync" with the key you use in the unordered_map.
If memory isn't a primary concern I'd advise you to keep it.
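A minimal sketch of that setup, assuming the members of myClass are made public for brevity. The Registry wrapper and its add/find functions are hypothetical names, not part of the question; the idea is only that the key is always copied from the object's own name, so the two cannot drift apart.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Same shape as myClass in the question, with public members for brevity.
struct myClass {
    std::string name;
    int attribute;
    int anotherAttribute;
};

// Hypothetical wrapper: the key is always taken from the object itself,
// so the key and the name member cannot drift apart.
class Registry {
public:
    // Returns false if an object with that name is already stored.
    bool add(myClass obj) {
        std::string key = obj.name;  // copy the key before moving the object
        return objects_.emplace(std::move(key), std::move(obj)).second;
    }

    // Average O(1) lookup; returns nullptr if the name is unknown.
    const myClass* find(const std::string& name) const {
        auto it = objects_.find(name);
        if (it == objects_.end()) return nullptr;
        assert(it->first == it->second.name);  // key and name stay in sync
        return &it->second;
    }

private:
    std::unordered_map<std::string, myClass> objects_;
};
```

Because find only hands out a pointer to const, a caller cannot rename a stored object behind the map's back.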
I have a std::vector<std::unique_ptr<SomeClass>> variable inside a private section of a class. Other parts of the program use the methods of that class to add elements to the vector. It works, but now I have a new requirement to allow users of the class to remove some elements from the vector as well.
My problem is: I still want to hide the vector from the outside world to keep encapsulation of the class. I thought my methods could just return an iterator to the elements in the vector, but I've read the C++ reference about it and they say, if a vector changes its size, all iterators made before are invalidated. So, my second idea was to return an index of the newly added element, but it is also not good for obvious reasons.
So, my question is: how do I make a persistent reference to an object inside a vector, to be used for deleting the object, without exposing the internals of my class too much?
Easy and fast solution:
Change std::vector<X> to std::map<int, X> or std::unordered_map<int, X>.
When adding an element, generate a unique id and return it to the user. Add the element using the id as the key in the map.
When the user wants to access or delete an element, they provide the id.
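The steps above can be sketched roughly as follows. The class name Owner, the Id alias, and the add/get/remove functions are hypothetical; the id generator is a simple counter, which assumes ids never need to be reused.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

struct SomeClass {
    std::string label;
};

// Hypothetical owner class: callers only ever see opaque integer ids.
class Owner {
public:
    using Id = int;

    // Stores the element and returns the id that identifies it from now on.
    Id add(std::unique_ptr<SomeClass> item) {
        const Id id = nextId_++;
        const bool inserted = items_.emplace(id, std::move(item)).second;
        assert(inserted);  // the counter never repeats, so the key is always new
        (void)inserted;
        return id;
    }

    // Returns nullptr if the id is unknown or was already removed.
    SomeClass* get(Id id) {
        auto it = items_.find(id);
        return it == items_.end() ? nullptr : it->second.get();
    }

    // Returns true if an element was actually removed.
    bool remove(Id id) { return items_.erase(id) == 1; }

private:
    Id nextId_ = 0;
    std::unordered_map<Id, std::unique_ptr<SomeClass>> items_;
};
```

Unlike an iterator or a vector index, an id stays meaningful no matter how many other elements are added or removed, and a stale id simply fails the lookup.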
I have a vector of journeys and a vector of locations. A journey is between two places.
struct Data {
std::vector<Journey> m_journeys;
std::vector<Locations> m_locations;
};
struct Journey {
?? m_startLocation;
?? m_endLocation;
};
How can I create the relationship between each journey and two locations?
I thought I could just store references/pointers to the start and end locations, however if more locations are added to the vector, then it will reallocate storage and move all the locations elsewhere in memory, and then the pointers to the locations will point to junk.
I could store the place names and then search the list in Data, but that would require keeping a reference to Data (breaking encapsulation/SRP), and then a not so efficient search.
I think if all the objects were created on the heap, then shared_ptr could be used, (so Data would contain std::vector<std::shared_ptr<Journey>>), then this would work? (it would require massive rewrite so avoiding this would be preferable)
Is there some C++/STL feature that is like a pointer but abstracts away/is independent of memory location (or order in the vector)?
No, there isn't any "C++/STL feature that is like a pointer but abstracts away/is independent of memory location".
That answers that.
This is simply not the right set of containers for such a relationship between classes. You have to pick the appropriate container for your objects first, instead of selecting some arbitrary container first, and then trying to figure out how to make it work with your relationship.
Using a vector of std::shared_ptrs would be one option, just need to watch out for circular references. Another option would be to use std::list instead of std::vector, since std::list does not reallocate when it grows.
If each Locations instance has a unique identifier of some kind, a third option is to store the locations in a std::map keyed by that identifier, refer to a location by its identifier, and look it up in the map when needed. Although a std::map also doesn't reallocate upon growth, the layer of indirection offers some value as well.
I'd say make a vector<shared_ptr<Location>> for your index of locations, and Journey would contain two weak_ptr<Location>.
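A rough sketch of the identifier approach, under a few assumptions: the element type is called Location here (the question writes Locations), LocationId is a hypothetical plain int, and addJourney/findLocation are made-up helper names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Location {
    std::string name;
};

using LocationId = int;  // hypothetical identifier type

// A journey refers to its locations by id, not by address.
struct Journey {
    LocationId m_startLocation;
    LocationId m_endLocation;
};

struct Data {
    std::vector<Journey> m_journeys;
    std::map<LocationId, Location> m_locations;

    // Records a journey between two locations that must already exist.
    void addJourney(LocationId from, LocationId to) {
        assert(m_locations.count(from) == 1 && m_locations.count(to) == 1);
        m_journeys.push_back(Journey{from, to});
    }

    // Returns nullptr when the id does not (or no longer) refer to a location.
    const Location* findLocation(LocationId id) const {
        auto it = m_locations.find(id);
        return it == m_locations.end() ? nullptr : &it->second;
    }
};
```

The ids stay valid however many locations are added afterwards, and a journey that refers to a removed location shows up as a failed lookup instead of a dangling pointer.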
// Journey is defined first so that Data can hold a vector of complete Journey objects.
struct Journey {
std::weak_ptr<Location> m_startLocation;
std::weak_ptr<Location> m_endLocation;
};
struct Data {
std::vector<Journey> m_journeys;
std::vector<std::shared_ptr<Location>> m_locations;
};
std::weak_ptr can dangle and that's exactly what you want. :)
The concern is that one could access a Journey containing a deleted Location. A weak pointer provides an expired() method that can tell you if the data of the parent shared pointer (that would be in your m_locations vector) still exists.
Accessing data from a weak pointer is safe, and will require the use of the lock() method.
Here is a great example of how one usually uses a weak pointer:
http://en.cppreference.com/w/cpp/memory/weak_ptr/lock
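For this particular case, usage might look like the sketch below. Location is given a single name member as an assumption, and makeJourney and startName are hypothetical helpers that are not part of the answer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

struct Location {
    std::string name;
};

struct Journey {
    std::weak_ptr<Location> m_startLocation;
    std::weak_ptr<Location> m_endLocation;
};

struct Data {
    std::vector<Journey> m_journeys;
    std::vector<std::shared_ptr<Location>> m_locations;
};

// Builds a journey between two entries of data.m_locations.
// The indices are only used here, at creation time.
Journey makeJourney(const Data& data, std::size_t from, std::size_t to) {
    assert(from < data.m_locations.size() && to < data.m_locations.size());
    return Journey{data.m_locations[from], data.m_locations[to]};
}

// Returns the start location's name, or an empty string if the
// Location has been destroyed in the meantime.
std::string startName(const Journey& journey) {
    // lock() yields a shared_ptr that keeps the Location alive while in use;
    // it is empty if the Location no longer exists.
    if (std::shared_ptr<Location> start = journey.m_startLocation.lock()) {
        return start->name;
    }
    return {};
}
```

Growing m_locations only moves the shared_ptr objects around; the Location objects themselves stay where they are, so existing journeys are unaffected.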
Please help to figure out the logic of using unordered_set with custom structures.
Consider I have following class
struct MyClass {
int id;
// other members
};
used with shared_ptr
using CPtr = std::shared_ptr<MyClass>;
For fast access by key, I planned to use an unordered_set with a custom hash and the MyClass::id member as the key:
template <class T> struct CHash;
template<> struct CHash<CPtr>
{
std::size_t operator() (const CPtr& c) const
{
return std::hash<decltype(c->id)> {} (c->id);
}
};
std::unordered_set<CPtr, CHash<CPtr>> my_set;
Right now, unordered_set still seems to be an appropriate container. However standard find() functions for sets are assumed to be const to ensure keys won't be changed. I intend to change objects guaranteeing keeping keys unchanged. So, the questions are:
1) How can I easily access an element of the set by its int key while keeping the ability to modify the element, something like
auto element = my_set.find(5);
element->b = 3.3;
It is possible to add converting constructor and use something like
auto element = my_set.find(MyClass (5));
But it doesn't solve the problem with constness and what if the class is huge.
2) Am I actually going wrong way? Should I use another container? For example unordered_map, that will store one more int key for each entry consuming more memory.
A pointer doesn't project its constness to the object it points to. Meaning, if you have a constant reference to a std::shared_ptr (as in a set), you can still modify the object via this pointer. Whether or not that is something you should do is a different question, and it doesn't solve your lookup problem.
Of course, if you want to look up a value by a key, then this is what std::unordered_map was designed for, so I'd have a closer look there. The main problem I see with this approach is not so much the memory overhead (unordered_set and unordered_map as well as shared_ptr have noticeable memory overhead anyway), but that you have to maintain redundant information (id in the object and id as a key).
If you don't have many insertions, you don't absolutely need the (on average) constant lookup time, and memory overhead is really important to you, you could consider a third solution (besides using a third-party or self-written data structure, of course): namely, to write a thin wrapper around a sorted std::vector<std::shared_ptr<MyClass>> or, if appropriate, even better std::vector<std::unique_ptr<MyClass>> that uses std::upper_bound for lookups.
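To illustrate the constness point, here is a small sketch. It assumes a member b of type double (the question only hints at it), uses non-template CHash and CEqual functors that compare by id, and adds a hypothetical findById helper that searches with a temporary probe object.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <unordered_set>

struct MyClass {
    int id;
    double b;  // stands in for the "other members"
};

using CPtr = std::shared_ptr<MyClass>;

// Hash and equality both look only at id, so two pointers to objects
// with the same id count as the same key.
struct CHash {
    std::size_t operator()(const CPtr& c) const {
        assert(c != nullptr);
        return std::hash<int>{}(c->id);
    }
};
struct CEqual {
    bool operator()(const CPtr& lhs, const CPtr& rhs) const { return lhs->id == rhs->id; }
};

using CSet = std::unordered_set<CPtr, CHash, CEqual>;

// Hypothetical helper: looks an element up by id using a temporary probe object.
// Returns an empty pointer if no element has that id.
CPtr findById(const CSet& set, int id) {
    auto it = set.find(std::make_shared<MyClass>(MyClass{id, 0.0}));
    if (it == set.end()) return nullptr;
    return *it;  // *it is a const CPtr&, but the MyClass it points to is not const
}
```

The probe costs an allocation per lookup and requires constructing a MyClass, which is exactly the drawback the question raises for large classes. Modifying id through the returned pointer would silently corrupt the set.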
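A sketch of such a wrapper, with the hypothetical name SortedStore and an assumed member b. Note one swap: the answer mentions std::upper_bound, while this sketch uses std::lower_bound, because it lands directly on the matching element if there is one.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct MyClass {
    int id;
    double b;
};

// Hypothetical thin wrapper: a vector of owning pointers kept sorted by id.
class SortedStore {
public:
    // O(n) insertion; returns false if the id is already present.
    bool insert(std::unique_ptr<MyClass> item) {
        assert(item != nullptr);
        auto pos = lowerBound(item->id);
        if (pos != items_.end() && (*pos)->id == item->id) return false;
        items_.insert(pos, std::move(item));
        return true;
    }

    // O(log n) lookup; returns nullptr if the id is absent.
    MyClass* find(int id) {
        auto pos = lowerBound(id);
        if (pos == items_.end() || (*pos)->id != id) return nullptr;
        return pos->get();
    }

private:
    using Items = std::vector<std::unique_ptr<MyClass>>;

    // First position whose id is not less than the requested id.
    Items::iterator lowerBound(int id) {
        return std::lower_bound(
            items_.begin(), items_.end(), id,
            [](const std::unique_ptr<MyClass>& item, int key) { return item->id < key; });
    }

    Items items_;
};
```

The id is stored only once, inside the object, and the per-element overhead is a single pointer. The price is linear-time insertion and logarithmic rather than constant lookup.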
I think you are going the wrong way with unordered_set, because the definition of unordered_set states very clearly that:
Keys are immutable, therefore, the elements in an unordered_set cannot be modified once in the container - they can be inserted and removed, though.
You can see the definition at:
http://www.cplusplus.com/reference/unordered_set/unordered_set/.
I hope this helps. Thanks.
I am trying to use a map container to hold Shapes and match those shapes to an ID number.
Until now, I have always used STL containers to hold and memory-manage my objects. So I would use containers of these sorts:
std::map<int, Square> squares;
std::map<int, Triangle> triangles;
std::map<int, Circle> circles;
But I want to have a single map to hold "Shapes", and this is an abstract base class of Square, Triangle and Circle. So to achieve this, I would still store the realisable derived class objects in their own maps, and then have another map:
std::map<int, Shape*> shapes;
to store pointers to the objects stored in the other maps.
This seems very messy though, and I'd rather store all the objects in a single polymorphic map that owns and memory-manages the contained objects.
After reading a little about Boost's ptr_map I expected this was the solution. But it seems that the base class needs to be realisable as while trying to use:
boost::ptr_map<int,Shape> shapes;
I get the error: "error: cannot allocate an object of abstract type 'Shape'"
Must I make the base class realisable? That would be a bit of a hack so I'd rather do this properly, if there is such a way.
My next best guess about how to do this is to use a container such as:
std::map<int, boost::shared_ptr<Shape> > shapes;
This seems like such a straightforward aim, but since I'm having such difficulty with it I suspect I'm trying to do something I shouldn't. So any advice about where I might be going wrong is greatly appreciated.
Thanks.
ptr_map<int, Shape> seems like the way to go, and works even with an abstract base type (see an example here). I guess the error you get comes from the use of operator[]. Indeed, as in std::map, operator[] inserts and returns a default-constructed value if the key was not already in the map. In this case, it can't construct the value since Shape is abstract, hence the compiler error.
So you can use ptr_map but not the indexing operator. When inserting, use insert, when looking up a key, use find.
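For comparison, here is a standard-library sketch of the same aim (a single owning, polymorphic map) that does not need Boost. It swaps ptr_map for std::map<int, std::unique_ptr<Shape>>, so it is not the ptr_map API; the Square and Circle members and the addShape/findShape helpers are assumptions made for the example. It follows the same insert-and-find discipline.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <utility>

struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;  // pure virtual: Shape is abstract
};

struct Square : Shape {
    explicit Square(double side) : side_(side) {}
    double area() const override { return side_ * side_; }
private:
    double side_;
};

struct Circle : Shape {
    explicit Circle(double radius) : radius_(radius) {}
    double area() const override { return 3.14159265358979 * radius_ * radius_; }
private:
    double radius_;
};

using ShapeMap = std::map<int, std::unique_ptr<Shape>>;

// Inserts without ever default-constructing a Shape; returns false on a duplicate id.
bool addShape(ShapeMap& shapes, int id, std::unique_ptr<Shape> shape) {
    assert(shape != nullptr);
    return shapes.emplace(id, std::move(shape)).second;
}

// Looks up with find(), which never inserts; returns nullptr if the id is unknown.
const Shape* findShape(const ShapeMap& shapes, int id) {
    auto it = shapes.find(id);
    return it == shapes.end() ? nullptr : it->second.get();
}
```

With this map, operator[] does compile, but it silently inserts a null pointer for a missing key, so find is still the safer way to look things up.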
Your base class does not need to be realisable. See the example here: http://www.boost.org/doc/libs/1_49_0/libs/ptr_container/doc/tutorial.html#associative-containers
The animal there does have abstract functions! In fact, this is one of the primary uses for ptr containers. Your error is probably caused elsewhere and you need to post more code.
Also:
This seems very messy though, and I'd rather store all the objects in a single polymorphic map that owns and memory-manages the contained objects.
I don't think it is. It can actually be beneficial to never "lose" the type of the actual object, in case you need that later. For example, if you wanted to do something for all Triangles, but not all other shapes, this will come in handy.
The disadvantage is of course that you need to keep all the maps in sync, but this is very easy to solve: stick those 4 maps (and preferably no other data members!) into the private section of a class, implement 3 insert overloads for the different types, and always insert into both the strongly typed "owning" map and the map with polymorphic pointers. It will be very easy to keep them in sync behind a small class interface. Just don't try to do it as an unstructured part of something bigger, or the code will start to look messy.
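Such a class could look roughly like this. ShapeStore, insertInto, and the corners() member are hypothetical names chosen for the sketch, and copying is disabled because a copied pointer map would still point into the original's typed maps.

```cpp
#include <cassert>
#include <map>

struct Shape {
    virtual ~Shape() = default;
    virtual int corners() const = 0;
};
struct Square : Shape { int corners() const override { return 4; } };
struct Triangle : Shape { int corners() const override { return 3; } };
struct Circle : Shape { int corners() const override { return 0; } };

// Hypothetical store: the typed maps own the objects, shapes_ only points at them.
class ShapeStore {
public:
    ShapeStore() = default;
    ShapeStore(const ShapeStore&) = delete;             // copies would point into
    ShapeStore& operator=(const ShapeStore&) = delete;  // the original's maps

    void insert(int id, const Square& s) { insertInto(squares_, id, s); }
    void insert(int id, const Triangle& t) { insertInto(triangles_, id, t); }
    void insert(int id, const Circle& c) { insertInto(circles_, id, c); }

    // Polymorphic lookup; returns nullptr if the id is unknown.
    const Shape* find(int id) const {
        auto it = shapes_.find(id);
        return it == shapes_.end() ? nullptr : it->second;
    }

    // Typed access stays available, e.g. to visit only the triangles.
    const std::map<int, Triangle>& triangles() const { return triangles_; }

private:
    template <class T>
    void insertInto(std::map<int, T>& owner, int id, const T& value) {
        assert(shapes_.count(id) == 0);  // ids are unique across all shape types
        auto it = owner.emplace(id, value).first;
        // std::map never relocates its elements, so this pointer stays valid
        // until the element itself is erased.
        shapes_[id] = &it->second;
    }

    std::map<int, Square> squares_;
    std::map<int, Triangle> triangles_;
    std::map<int, Circle> circles_;
    std::map<int, Shape*> shapes_;
};
```

An erase function would have to remove the entry from both the typed map and shapes_; it is left out here to keep the sketch short.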
I have two lines of code I want explained a bit please. As much as you can tell me. Mainly the benefits of each and what is happening behind the scenes with memory and such.
Here are two structs as an example:
struct Employee
{
std::string firstname, lastname;
char middleInitial;
Date hiringDate; // another struct, not important for example
short department;
};
struct Manager
{
Employee emp; // manager employee record
list<Employee*>group; // people managed
};
Which is better to use out of these two in the above struct and why?
list<Employee*>group;
list<Employee>group;
First of all, std::list is a doubly-linked list. So both those statements are creating a linked list of employees.
list<Employee*> group;
This creates a list of pointers to Employee objects. In this case there needs to be some other code to allocate each employee before you can add it to the list. Similarly, each employee must be deleted separately, std::list will not do this for you. If the list of employees is to be shared with some other entity this would make sense. It'd probably be better to place the employee in a smart pointer class to prevent memory leaks. Something like
typedef std::list<std::shared_ptr<Employee>> EmployeeList;
EmployeeList group;
This line
list<Employee>group;
creates a list of Employee objects by value. Here you can construct Employee objects on the stack, add them to the list and not have to worry about memory allocation. This makes sense if the employee list is not shared with anything else.
One is a list of pointers and the other is a list of objects. If you've already allocated the objects, the first makes sense.
You probably want to use the first one (the list of pointers) if the "people managed" are also stored in another location. To elaborate: if you also have a global list of companyEmployees, you probably want to have pointers, as you want to share the object representing an employee between the locations (so that, for example, if you update the name the change is "seen" from both locations).
If instead you only want to know "why a list of structs instead of a list of pointers", the answer is: better memory locality and no need to deallocate the individual Employee objects, but be careful that every assignment to/from a list node (for example, through an iterator and its * operator) copies the whole struct and not just a pointer.
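The difference in what is "seen" can be sketched like this. The Employee struct is trimmed to two name fields, std::shared_ptr is used for the pointer variant as a swap-in for raw pointers, and the two function names are made up for the example.

```cpp
#include <cassert>
#include <list>
#include <memory>
#include <string>

struct Employee {
    std::string firstname, lastname;
};

// By value: the list holds its own copy, so later edits to the original are not seen.
std::string copiedNameAfterEdit() {
    Employee alice{"Alice", "Smith"};
    std::list<Employee> group;
    group.push_back(alice);    // copies the whole struct
    alice.lastname = "Jones";  // changes only the original
    assert(!group.empty());
    return group.front().lastname;
}

// Shared: both lists point at the same Employee, so an edit is visible from either.
std::string sharedNameAfterEdit() {
    auto alice = std::make_shared<Employee>(Employee{"Alice", "Smith"});
    std::list<std::shared_ptr<Employee>> companyEmployees{alice};
    std::list<std::shared_ptr<Employee>> group{alice};
    companyEmployees.front()->lastname = "Jones";
    return group.front()->lastname;
}
```

The first function returns the old last name because the list element is an independent copy; the second returns the new one because there is only one Employee object.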
The first one stores the objects by pointer. In this case you need to carefully document who owns the allocated memory and who's responsible for cleaning it up when done. The second one stores the objects by value and has full control of their lifespan.
Which one to use depends on context you haven't given in your question although I favor the second slightly as a default because it doesn't leave open the possibility of mismanaging your memory.
But after all that, carefully consider if list is actually the right container choice for you. Typically it's a low-priority container that satisfies very specific needs. I almost always favor vector and deque first for random access containers, or set and map for ordered containers.
If you do need to store pointers in the container, boost provides ptr-container classes that manage the memory for you, or I suggest storing some sort of smart pointer so that the memory is cleaned up automatically when the object isn't needed anymore.
A lot depends on what you are doing. For starters, do you really want
Manager to contain an Employee, rather than to be one: the classical
example of a manager (one of the classic OO examples) would be:
struct Manager : public Employee
{
list<Employee*> group;
};
Otherwise, you have the problem that you cannot put managers into the
group of another manager; you're limited to one level in the management
hierarchy.
The second point is that in order to make an intelligent decision, you
have to understand the role of Employee in the program. If Employee
is just a value: some hard data, typically immutable (except by
assignment of a complete Employee), then list<Employee> group is
definitely to be preferred: don't use pointers unless you have to. If
Employee is an "entity", which models some external entity (say an
employee of the firm), you would generally make it uncopyable and
unassignable, and use list<Employee*> (with some sort of mechanism to
inform the Manager when the employee is fired, and the pointed to
object is deleted). If managers are employees, and you don't want to
lose this fact when they are added to a group, then you have to use the
pointer version: polymorphism requires pointers or references to work
(and you can't have a container of references).
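A small sketch of the inheritance variant, assuming a virtual headcount() function and a single name member that are not in the question. It shows a manager sitting in another manager's group, with the virtual call still reaching the Manager override through an Employee*.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <utility>

struct Employee {
    explicit Employee(std::string n) : name(std::move(n)) {}
    virtual ~Employee() = default;
    // Number of people below this employee in the hierarchy.
    virtual std::size_t headcount() const { return 0; }
    std::string name;
};

struct Manager : Employee {
    using Employee::Employee;  // reuse the Employee constructor
    std::size_t headcount() const override {
        std::size_t total = 0;
        for (const Employee* e : group) {
            assert(e != nullptr);
            total += 1 + e->headcount();  // virtual call: works for nested managers too
        }
        return total;
    }
    std::list<Employee*> group;  // non-owning; the pointed-to objects live elsewhere
};
```

With list<Employee> instead, pushing a Manager into the group would slice it down to a plain Employee and its own group would be lost.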
The two lists are good, but they will require a completely different handling.
list<Employee*>group;
is a list of pointers to objects of type Employee and you will store there pointers to objects allocated dynamically, and you will need to be particularly clear as to who will delete those objects.
list<Employee>group;
is a list of objects of type Employee; you get the benefit (and associated cost in terms of performance) of dealing with concrete instances that you do not need to memory manage yourself.
Specifically, one of the advantages of using std::list compared to a plain array, is that you can have a list of objects and avoid the cost and risks of dealing with dynamic memory allocation and pointers.
With a list of objects, you can do, e.g.:
Employee a; // object allocated on the stack
group.push_back(a); // the list makes a copy for you
Employee* b = new Employee....
group.push_back(*b); // the pointed-to object is copied
delete b;
With a list of pointers you are, in practice, forced to always use dynamic allocation, or to refer to objects whose lifetime is longer than the list's (if you can guarantee it).
By using a std::list of pointers, you are more or less in the same situation as when using a plain array of pointers as far as memory management is concerned. The only advantage you get is that the list can grow dynamically without effort on your part.
I personally don't see much sense in using a list of pointers; basically, because I think that pointers should be used (always, when possible) through smart pointers. So, if you really need pointers, you will be better off, IMO, using a list of smart pointers provided by boost.
Use the first one if you're allocating or accessing the structures separately.
Use the second one if you'll only be allocating/accessing them through the list.
First one defines a list of pointers to objects, the second a list of objects.
The first version (with pointers) is preferred by many programmers.
The main reason is that STL containers copy elements by value, so storing pointers makes sorting and internal reallocation cheaper.
You probably want to use unique_ptr<> or shared_ptr<> rather than plain old * pointers (auto_ptr<> is deprecated and is not safe to store in standard containers). This goes some, if not all, of the way toward getting the expected pointer semantics without most of the manual memory-management issues.