Memory-efficient two-way-mapping in C++

Memory-efficient two-way-mapping in C++ - c++

I need to create a two-way mapping from ints to an object. I can't use the boost::bimap because my objects are modified after being placed in the mapping (they are being modified in ways that do not affect the mapping).
The simple solution is two use a vector and an unordered_map:
vector<MyClass> _vector;
unordered_map<MyClass, size_t> _map;
However, this maintains two copies of each MyClass, and I don't want that.
I can keep MyClass * pointers in one of the containers, and use the storage of the other, but I'm afraid either vector or unordered_map can move the instances around (when reallocating the vector, or resizing the hash table).
Any help would be appreciated.

And can't you just store your objects in one vector and save the mapped indexes in another?
std::vector<MyClass> vC;
std::vector<unsigned int> vM;
Then vC[vM[i]] is the mapped class of the vC[i] object.
Though if you give more details about what you are trying to do (is the map reflexive? All classes have a mapped class or just some? How often you need to modify your objects?) we could help a bit more.

You can use a std::shared_ptr<MyClass> in your main container, and std::weak_ptr<MyClass> in the referencing one.
Unfortunately you didn't give enough context or requirements, to give you a concise example. Anyway, you should also have some synchronization management, that removes entries from the referencing container, as soon these are erased from the main container, though std::weak_ptr makes this easier to implement.
Not only this is more memory efficient, it also relieves you from keeping otherwise unrelated copies of MyClass in sync.

Related

Relationships between C++ objects

I have a vector of journeys and a vector of locations. A journey is between two places.
struct Data {
std::vector<Journey> m_journeys;
std::vector<Locations> m_locations;
};
struct Journey {
?? m_startLocation;
?? m_endLocation;
};
How can I create the relationship between each journey and two locations?
I thought I could just store references/pointers to the start and end locations, however if more locations are added to the vector, then it will reallocate storage and move all the locations elsewhere in memory, and then the pointers to the locations will point to junk.
I could store the place names and then search the list in Data, but that would require keeping a reference to Data (breaking encapsulation/SRP), and then a not so efficient search.
I think if all the objects were created on the heap, then shared_ptr could be used, (so Data would contain std::vector<std::shared_ptr<Journey>>), then this would work? (it would require massive rewrite so avoiding this would be preferable)
Is there some C++/STL feature that is like a pointer but abstracts away/is independent of memory location (or order in the vector)?

No, there isn't any "C++/STL feature that is like a pointer but abstracts away/is independent of memory location".
That answers that.
This is simply not the right set of containers for such a relationship between classes. You have to pick the appropriate container for your objects first, instead of selecting some arbitrary container first, and then trying to figure out how to make it work with your relationship.
Using a vector of std::shared_ptrs would be one option, just need to watch out for circular references. Another option would be to use std::list instead of std::vector, since std::list does not reallocate when it grows.
If each Locations instance has a unique identifier of some kind, using a std::map, and then using that location identifier to refer to a location, and then looking it up in the map. Although a std::map also doesn't reallocate upon growth, the layer of indirection offers some value as well.

I'd say make a vector<shared_ptr<Location>>for your index of locations, and Journey would contain two weak_ptr<Location>.
struct Data {
std::vector<Journey> m_journeys;
std::vector<std::shared_ptr<Location>> m_locations;
};
struct Journey {
std::weak_ptr<Location> m_startLocation;
std::weak_ptr<Location> m_endLocation;
};
std::weak_ptr can dangle and that's exactly what you want. :)
The concern is that one could access a Journey containing a deleted Location. A weak pointer provides an expired() method that can tell you if the data of the parent shared pointer (that would be in your m_locations vector) still exists.
Accessing data from a weak pointer is safe, and will require the use of the lock() method.
Here is a great example of how one usually uses a weak pointer:
http://en.cppreference.com/w/cpp/memory/weak_ptr/lock

QMap insert QVector<QString> by pointer or value?

I want to build up a map of devices such that the map contains:
QString 'DeviceID' and QVector 'Command List'
Currently I have the QMap as follows:
QMap<QString, QVector<QString> *> devices;
QVector<QString> *pCommands= new QVector<QString>;
// :
// Fill pCommands with lots of data here
// :
devices.insert(RadioID, pCommands);
But I am wondering if this is actually any better then:
QMap<QString, QVector<QString>> devices;
QVector<QString> commands;
// :
// Fill commands with lots of data here
// :
devices.insert(RadioID, commands);
I am sure that I read somewhere that Qt does something quite efficient when copying data. I am not seeing many people using pointers with Qt and it seems messy that I have to go through the QMap deleting all the QVector's at the end...
I am using c++11, so maybe some kind of move semantic may work here?
EDIT
I have modified the comments in the code to show that the vector is not empty.
Also I would state that I do not need to change the data once it is stored.

There is no reason to consider manually allocating the vectors to be better.
Sure, you only need to copy a pointer, rather than a vector, but an empty vector is still quite fast to copy. The biggest gain of storing objects rather than pointers is that you don't need to do manual memory management.
I am using c++11, so maybe some kind of move semantic may work here?
If QMap::insert supports move semantics, and if QVector is move-constructible like their standard library counterparts, then you could indeed move the vector into the map. But moving an empty vector is just as fast as copying.
If QMap has an emplace like function std::map does, then you could even construct the vector in-place without even a move.
I'm not familiar with Qt, though so you'll need to verify those details from the documentation. Whether Qt supports move semantics doesn't change the fact that manual memory management is a pain.
Edit: according to the documentation QVector appears to be movable, but QMap does not support move semantics. However, as Arpegius and the documentation point out, QVector does copy-on-write optimization, so as long as the copied vector is not modified, then the data won't be copied. None of this matters really, when copying an empty vector.
Edit again
If the added vectors are full of data, then copying is indeed quite expensive unless it remains unmodified. Moving would remedy that, but QMap appears not to support that. There is another trick, though: Insert an empty vector, and then swap the full vector with the empty one in the map.

The simplest and pretty much idiomatic way to do it would be:
QMap<QString, QVector<QString>> devices;
// Get the vector constructed in the map itself.
auto & commands = devices[RadioID];
// Fill commands with lots of data here - a bit faster
commands.resize(2);
commands[0] = "Foo";
commands[1] = "Bar";
// Fill commands with lots of data here - a bit slower
commands.append("Baz");
commands.append("Fan");
You'll see this pattern often in Qt and other code, and it's the simplest way to do it that works equally well on C++98 and C++11 :)
This way you're working on the vector that's already in the map. The vector itself is never copied. The members of the vector are copied, but they are implicitly shared, so copying is literally a pointer copy and an atomic reference count increment.

shared_ptr with vector

I currently have vectors such as:
vector<MyClass*> MyVector;
and I access using
MyVector[i]->MyClass_Function();
I would like to make use of shared_ptr. Does this mean all I have to do is change my vector to:
typedef shared_ptr<MyClass*> safe_myclass
vector<safe_myclass>
and I can continue using the rest of my code as it was before?

vector<shared_ptr<MyClass>> MyVector; should be OK.
But if the instances of MyClass are not shared outside the vector, and you use a modern C++11 compiler, vector<unique_ptr<MyClass>> is more efficient than shared_ptr (because unique_ptr doesn't have the ref count overhead of shared_ptr).

Probably just std::vector<MyClass>. Are you
working with polymorphic classes or
can't afford copy constructors or have a reason you can't copy and are sure this step doesn't get written out by the compiler?
If so then shared pointers are the way to go, but often people use this paradigm when it doesn't benefit them at all.
To be complete if you do change to std::vector<MyClass> you may have some ugly maintenance to do if your code later becomes polymorphic, but ideally all the change you would need is to change your typedef.
Along that point, it may make sense to wrap your entire std::vector.
class MyClassCollection {
private : std::vector<MyClass> collection;
public : MyClass& at(int idx);
//...
};
So you can safely swap out not only the shared pointer but the entire vector. Trade-off is harder to input to APIs that expect a vector, but those are ill-designed as they should work with iterators which you can provide for your class.
Likely this is too much work for your app (although it would be prudent if it's going to be exposed in a library facing clients) but these are valid considerations.

Don't immediately jump to shared pointers. You might be better suited with a simple pointer container if you need to avoid copying objects.

c++ vector construct with given memory

I'd like to use a std::vector to control a given piece of memory. First of all I'm pretty sure this isn't good practice, but curiosity has the better of me and I'd like to know how to do this anyway.
The problem I have is a method like this:
vector<float> getRow(unsigned long rowIndex)
{
float* row = _m->getRow(rowIndex); // row is now a piece of memory (of a known size) that I control
vector<float> returnValue(row, row+_m->cols()); // construct a new vec from this data
delete [] row; // delete the original memory
return returnValue; // return the new vector
}
_m is a DLL interface class which returns an array of float which is the callers responsibility to delete. So I'd like to wrap this in a vector and return that to the user.... but this implementation allocates new memory for the vector, copies it, and then deletes the returned memory, then returns the vector.
What I'd like to do is to straight up tell the new vector that it has full control over this block of memory so when it gets deleted that memory gets cleaned up.
UPDATE: The original motivation for this (memory returned from a DLL) has been fairly firmly squashed by a number of responders :) However, I'd love to know the answer to the question anyway... Is there a way to construct a std::vector using a given chunk of pre-allocated memory T* array, and the size of this memory?

The obvious answer is to use a custom allocator, however you might find that is really quite a heavyweight solution for what you need. If you want to do it, the simplest way is to take the allocator defined (as the default scond template argument to vector<>) by the implementation, copy that and make it work as required.
Another solution might be to define a template specialisation of vector, define as much of the interface as you actually need and implement the memory customisation.
Finally, how about defining your own container with a conforming STL interface, defining random access iterators etc. This might be quite easy given that underlying array will map nicely to vector<>, and pointers into it will map to iterators.
Comment on UPDATE: "Is there a way to construct a std::vector using a given chunk of pre-allocated memory T* array, and the size of this memory?"
Surely the simple answer here is "No". Provided you want the result to be a vector<>, then it has to support growing as required, such as through the reserve() method, and that will not be possible for a given fixed allocation. So the real question is really: what exactly do you want to achieve? Something that can be used like vector<>, or something that really does have to in some sense be a vector, and if so, what is that sense?

Vector's default allocator doesn't provide this type of access to its internals. You could do it with your own allocator (vector's second template parameter), but that would change the type of the vector.
It would be much easier if you could write directly into the vector:
vector<float> getRow(unsigned long rowIndex) {
vector<float> row (_m->cols());
_m->getRow(rowIndex, &row[0]); // writes _m->cols() values into &row[0]
return row;
}
Note that &row[0] is a float* and it is guaranteed for vector to store items contiguously.

The most important thing to know here is that different DLL/Modules have different Heaps. This means that any memory that is allocated from a DLL needs to be deleted from that DLL (it's not just a matter of compiler version or delete vs delete[] or whatever). DO NOT PASS MEMORY MANAGEMENT RESPONSIBILITY ACROSS A DLL BOUNDARY. This includes creating a std::vector in a dll and returning it. But it also includes passing a std::vector to the DLL to be filled by the DLL; such an operation is unsafe since you don't know for sure that the std::vector will not try a resize of some kind while it is being filled with values.
There are two options:
Define your own allocator for the std::vector class that uses an allocation function that is guaranteed to reside in the DLL/Module from which the vector was created. This can easily be done with dynamic binding (that is, make the allocator class call some virtual function). Since dynamic binding will look-up in the vtable for the function call, it is guaranteed that it will fall in the code from the DLL/Module that originally created it.
Don't pass the vector object to or from the DLL. You can use, for example, a function getRowBegin() and getRowEnd() that return iterators (i.e. pointers) in the row array (if it is contiguous), and let the user std::copy that into its own, local std::vector object. You could also do it the other way around, pass the iterators begin() and end() to a function like fillRowInto(begin, end).
This problem is very real, although many people neglect it without knowing. Don't underestimate it. I have personally suffered silent bugs related to this issue and it wasn't pretty! It took me months to resolve it.
I have checked in the source code, and boost::shared_ptr and boost::shared_array use dynamic binding (first option above) to deal with this.. however, they are not guaranteed to be binary compatible. Still, this could be a slightly better option (usually binary compatibility is a much lesser problem than memory management across modules).

Your best bet is probably a std::vector<shared_ptr<MatrixCelType>>.
Lots more details in this thread.

If you're trying to change where/how the vector allocates/reallocates/deallocates memory, the allocator template parameter of the vector class is what you're looking for.
If you're simply trying to avoid the overhead of construction, copy construction, assignment, and destruction, then allow the user to instantiate the vector, then pass it to your function by reference. The user is then responsible for construction and destruction.
It sounds like what you're looking for is a form of smart pointer. One that deletes what it points to when it's destroyed. Look into the Boost libraries or roll your own in that case.

The Boost.SmartPtr library contains a whole lot of interesting classes, some of which are dedicated to handle arrays.
For example, behold scoped_array:
int main(int argc, char* argv[])
{
boost::scoped_array<float> array(_m->getRow(atoi(argv[1])));
return 0;
}
The issue, of course, is that scoped_array cannot be copied, so if you really want a std::vector<float>, #Fred Nurk's is probably the best you can get.
In the ideal case you'd want the equivalent to unique_ptr but in array form, however I don't think it's part of the standard.

When to use pointers in C++

I just started learning about pointers in C++, and I'm not very sure on when to use pointers, and when to use actual objects.
For example, in one of my assignments we have to construct a gPolyline class, where each point is defined by a gVector. Right now my variables for the gPolyline class looks like this:
private:
vector<gVector3*> points;
If I had vector< gVector3 > points instead, what difference would it make? Also, is there a general rule of thumb for when to use pointers? Thanks in advance!

The general rule of thumb is to use pointers when you need to, and values or references when you can.
If you use vector<gVector3> inserting elements will make copies of these elements and the elements will not be connected any more to the item you inserted. When you store pointers, the vector just refers to the object you inserted.
So if you want several vectors to share the same elements, so that changes in the element are reflected in all the vectors, you need the vectors to contain pointers. If you don't need such functionality storing values is usually better, for example it saves you from worrying about when to delete all these pointed to objects.

Pointers are generally to be avoided in modern C++. The primary purpose for pointers nowadays revolves around the fact that pointers can be polymorphic, whereas explicit objects are not.
When you need polymorphism nowadays though it's better to use a smart pointer class -- such as std::shared_ptr (if your compiler supports C++0x extensions), std::tr1::shared_ptr (if your compiler doesn't support C++0x but does support TR1) or boost::shared_ptr.

Generally, it's a good idea to use pointers when you have to, but references or alternatively objects objects (think of values) when you can.
First you need to know if gVector3 fulfils requirements of standard containers, namely if the type gVector3 copyable and assignable. It is useful if gVector3 is default constructible as well (see UPDATE note below).
Assuming it does, then you have two choices, store objects of gVector3 directly in std::vector
std::vector<gVector3> points;
points.push_back(gVector(1, 2, 3)); // std::vector will make a copy of passed object
or manage creation (and also destruction) of gVector3 objects manually.
std::vector points;
points.push_back(new gVector3(1, 2, 3));
//...
When the points array is no longer needed, remember to talk through all elements and call delete operator on it.
Now, it's your choice if you can manipulate gVector3 as objects (you can assume to think of them as values or value objects) because (if, see condition above) thanks to availability of copy constructor and assignment operator the following operations are possible:
gVector3 v1(1, 2, 3);
gVector3 v2;
v2 = v1; // assignment
gVector3 v3(v2); // copy construction
or you may want or need to allocate objects of gVector3 in dynamic storage using new operator. Meaning, you may want or need to manage lifetime of those objects on your own.
By the way, you may be also wondering When should I use references, and when should I use pointers?
UPDATE: Here is explanation to the note on default constructibility. Thanks to Neil for pointing that it was initially unclear. As Neil correctly noticed, it is not required by C++ standard, however I pointed on this feature because it is an important and useful one. If type T is not default constructible, what is not required by the C++ standard, then user should be aware of potential problems which I try to illustrate below:
#include <vector>
struct T
{
int i;
T(int i) : i(i) {}
};
int main()
{
// Request vector of 10 elements
std::vector<T> v(10); // Compilation error about missing T::T() function/ctor
}

You can use pointers or objects - it's really the same at the end of the day.
If you have a pointer, you'll need to allocate space for the actual object (then point to it) any way. At the end of the day, if you have a million objects regardless of whether you are storing pointers or the objects themselves, you'll have the space for a million objects allocated in the memory.
When to use pointers instead? If you need to pass the objects themselves around, modify individual elements after they are in the data structure without having to retrieve them each and every time, or if you're using a custom memory manager to manage the allocation, deallocation, and cleanup of the objects.
Putting the objects themselves in the STL structure is easier and simpler. It requires less * and -> operators which you may find to be difficult to comprehend. Certain STL objects would need to have the objects themselves present instead of pointers in their default format (i.e. hashtables that need to hash the entry - and you want to hash the object, not the pointer to it) but you can always work around that by overriding functions, etc.
Bottom line: use pointers when it makes sense to. Use objects otherwise.

Normally you use objects.
Its easier to eat an apple than an apple on a stick (OK 2 meter stick because I like candy apples).
In this case just make it a vector<gVector3>
If you had a vector<g3Vector*> this implies that you are dynamically allocating new objects of g3Vector (using the new operator). If so then you need to call delete on these pointers at some point and std::Vector is not designed to do that.
But every rule is an exception.
If g3Vector is a huge object that costs a lot to copy (hard to tell read your documentation) then it may be more effecient to store as a pointer. But in this case I would use the boost::ptr_vector<g3Vector> as this automatically manages the life span of the object.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js