Multi-index on boost::ptr_vector - c++

I have the following classes in a program.
class Class2 {
public:
std::string name;
unsigned int value;
};
class Class1 {
public:
boost::ptr_vector<Class2> fields;
};
I want to write a member function in Class1 that returns a reference or pointer to an element in fields based on Class2's name variable. I don't have to be concerned with the lifetime of the objects in the container.
Currently, I am returning an iterator to the element I want after the function searches from the start of the vector to the element.
boost::ptr_vector<Class2>::iterator getFieldByName(std::string name) {
boost::ptr_vector<Class2>::iterator field = fields.begin();
while (field != fields.end()) {
if (field->name.compare(name) == 0) {
return field;
}
++field;
}
return fields.end();
}
The problems that I'm facing are:
(1.) I need to have fast random access to the elements or the program sits in getFieldByName() too long (a boost::ptr_vector<> is too slow when starting at the beginning of the container)
(2.) I need to preserve the order of insertion of the fields (so I can't use a boost::ptr_map<> directly)
I have discovered Boost::MultiIndex and it seems like it could provide a solution to the problems, but I need to use a smart container so that destruction of the container will also destruct the objects owned by the container.
Is there any way to get a smart container that offers multiple methods of access?

You can use two containers. Have a boost::ptr_map<> that owns the actual data, and then have a std::vector<> that stores pointers to the objects owned by the map.
boost::ptr_map<std::string, Class2> by_field;
std::vector<Class2 const*> by_order;
void insert(Class2* obj) {
if (by_field.insert(obj->name, obj).second) {
// on insertion success, also add to by_order
by_order.push_back(obj);
}
}
This will give you O(log n) access in your getFieldByName() function (just look the name up in by_field) while also preserving the order of insertion (just iterate over by_order).
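The same two-container idea can also be sketched with only the standard library. This is a minimal illustration, not the code from the answer above: the names Field, FieldSet, by_order_ and by_name_ are hypothetical, a std::vector of std::unique_ptr owns the objects in insertion order, and a std::unordered_map from name to vector index is assumed to be an acceptable lookup structure.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct Field {              // hypothetical stand-in for Class2
    std::string name;
    unsigned int value;
};

class FieldSet {            // hypothetical stand-in for Class1
public:
    // Takes ownership. Returns false (and destroys the field) if the name already exists.
    bool insert(std::unique_ptr<Field> field) {
        if (by_name_.count(field->name) != 0) {
            return false;
        }
        by_name_.emplace(field->name, by_order_.size());
        by_order_.push_back(std::move(field));
        return true;
    }

    // Hash lookup by name; returns nullptr when the name is unknown.
    Field* getFieldByName(const std::string& name) {
        auto it = by_name_.find(name);
        return it == by_name_.end() ? nullptr : by_order_[it->second].get();
    }

    // The fields in insertion order.
    const std::vector<std::unique_ptr<Field>>& fields() const { return by_order_; }

private:
    std::vector<std::unique_ptr<Field>> by_order_;          // owns the objects
    std::unordered_map<std::string, std::size_t> by_name_;  // name -> index in by_order_
};
```

Destroying a FieldSet destroys every Field it owns, which covers the "smart container" requirement. The sketch does not support erasing a field; that would require updating the stored indices.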

Related

C++ N-last added items container

I'm trying to find the optimal data structure for the following simple task: a class that keeps the last N added item values in a built-in container. If the object receives item N+1, it should be added at the end of the container and the first item should be removed. It's like a simple queue, but the class should have a GetAverage method, and other methods that need access to every item. Unfortunately, std::queue doesn't have begin and end methods for this purpose.
Here is part of the simple class interface:
class StatItem final
{
static int ITEMS_LIMIT;
public:
StatItem() = default;
~StatItem() = default;
void Reset();
void Insert(int val);
int GetAverage() const;
private:
std::queue<int> _items;
};
And part of desired implementation:
void StatItem::Reset()
{
std::queue<int> empty;
std::swap(_items, empty);
}
void StatItem::Insert(int val)
{
_items.push(val);
if (_items.size() > static_cast<size_t>(ITEMS_LIMIT)) // keep at most ITEMS_LIMIT items
{
_items.pop();
}
}
int StatItem::GetAverage() const
{
const size_t itemCount{ _items.size() };
if (itemCount == 0) {
return 0;
}
const int sum = std::accumulate(_items.begin(), _items.end(), 0); // Error: std::queue doesn't have these methods
return sum / itemCount;
}
Any ideas?
I'm not sure about std::deque. Is it efficient enough, and should I use it for this task, or something different?
P.S.: ITEMS_LIMIT in my case is about 100-500 items.
The data structure you're looking for is a circular buffer. There is an implementation in the Boost library, however in this situation since it doesn't seem you need to remove items you can easily implement one using a std::vector or std::array.
You will need to keep track of the number of elements in the vector so far so that you can average correctly until you reach the element limit, and also the current insertion index which should just wrap when you reach that limit.
Using an array or vector will allow you to benefit from having a fixed element limit, as the elements will be stored in a single block of memory (good for fast memory access), and with both data structures you can make space for all elements you need on construction.
If you choose to use a std::vector, make sure to use the 'fill' constructor (http://www.cplusplus.com/reference/vector/vector/vector/), which will allow you to create the right number of elements from the beginning and avoid any extra allocations.
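A minimal sketch of the circular buffer described above, assuming a fixed limit passed to the constructor. The class name RingStat and the members next_ and count_ are hypothetical; the method names follow the StatItem interface from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

class RingStat {
public:
    // Fill constructor: allocates all slots once. limit must be greater than 0.
    explicit RingStat(std::size_t limit) : items_(limit, 0) {}

    void Insert(int val) {
        items_[next_] = val;                     // overwrites the oldest value once the buffer is full
        next_ = (next_ + 1) % items_.size();     // wrap the insertion index
        if (count_ < items_.size()) {
            ++count_;
        }
    }

    void Reset() {
        next_ = 0;
        count_ = 0;
    }

    int GetAverage() const {
        if (count_ == 0) {
            return 0;
        }
        // Until the buffer is full the valid values occupy slots [0, count_);
        // afterwards count_ equals the size, so every slot is valid.
        const long long sum = std::accumulate(
            items_.begin(), items_.begin() + static_cast<std::ptrdiff_t>(count_), 0LL);
        return static_cast<int>(sum / static_cast<long long>(count_));
    }

private:
    std::vector<int> items_;
    std::size_t next_ = 0;    // slot the next Insert writes to
    std::size_t count_ = 0;   // number of valid values stored so far
};
```

Because the average does not depend on the order of the stored values, GetAverage can read the slots in storage order rather than insertion order.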

C++ Tree Data Structure

Background:
So I've been porting some of my older Java code to C++, and I've come across an issue that's making proceeding quite difficult. My project uses a tree data-structure to represent the node hierarchy for 3D animation.
Java:
public final class Node {
private final Node mParent;
private final ArrayList<Node> mChildren;
//private other data, add/remove children / parents, etc ...
}
In Java, it's quite simple to create a tree that allows for modification etc.
Problem:
I'm running into issues with C++: arrays cannot easily be added to without manually allocating a new chunk of memory and moving the existing elements over, so I switched to std::vector. Vectors have the issue of doing what I just described internally, which makes any pointers to their elements invalid. So basically, if you want to use pointers you need a way to back them so that the memory holding the actual nodes doesn't move. I heard you can use std::shared_ptr/std::unique_ptr to wrap the nodes in the std::vector, and I tried to play around with that approach, but it becomes quite unwieldy. Another option would be to have a "tree" class that wraps the node class and is the interface to manipulate it, but then (for my use case) it would be quite annoying to deal with cutting branches off and making them into their own trees and possibly attaching different branches.
Most examples I see online are Binary trees that have 2 nodes rather than being dynamic, or they have many comments about memory leaks / etc. I'm hoping there's a good C++ alternative to the java code shown above (without memory leak issues etc). Also I won't be doing ANY sorting, the purpose of the tree is to maintain the hierarchy not to sort it.
Honestly I'm really unsure of what direction to go, I've spent the last 2 days trying different approaches but none of them "feel" right, and are usually really awkward to manage, any help would be appreciated!
Edit:
An edit as to why shared_ptrs are unwieldy:
class tree : public std::enable_shared_from_this<tree> {
std::shared_ptr<tree> parent; // note: a std::weak_ptr here would avoid a parent/child ownership cycle
std::vector<std::shared_ptr<tree>> children;
public:
void set_parent(tree& _tree) {
auto this_shared_ptr = shared_from_this();
if (parent != nullptr) {
auto& vec = parent->children; // take a reference so the erase affects the real child list
vec.erase(std::remove(vec.begin(), vec.end(), this_shared_ptr), vec.end());
}
parent = _tree.shared_from_this(); // _tree must already be owned by a shared_ptr
parent->children.push_back(this_shared_ptr);
}
};
Working with pointers like above becomes really quite verbose, and I was hoping for a simpler solution.
You could store your nodes in a single vector and use relative pointers that are not changed when the vectors are resized:
typedef int32_t Offset;
struct Node {
Node(Offset p) : parent(p) {}
Offset parent = 0; // 0 means no parent, so root node
std::vector<Offset> children;
};
std::vector<Node> tree;
std::vector<uint32_t> free_list;
To add a node:
uint32_t index;
if (free_list.empty()) {
index = tree.size();
tree.emplace_back(parent_index - tree.size());
} else {
index = free_list.back();
free_list.pop_back();
tree[index].parent = parent_index - index;
}
tree[parent_index].children.push_back(index - parent_index);
To remove a node:
assert(node.children.empty());
if (node.parent) {
Node* parent = &node + node.parent;
auto victim = find(parent->children.begin(), parent->children.end(), -node.parent);
swap(*victim, parent->children.back()); // more efficient than erase from middle
parent->children.pop_back();
}
free_list.push_back(&node - tree.data());
The difference you're seeing arises only if you put the objects directly in the vector itself in C++ (which you cannot do in Java). Their addresses are then tied to the vector's currently allocated buffer. In Java, all objects are allocated individually, so only an "object reference" is actually stored in the array. The equivalent in C++ is a vector of pointers (preferably wrapped in smart pointer objects), so that each vector element is only an address while the objects themselves live at fixed locations in memory. It adds an extra pointer hop, but it behaves more like what you expect from Java.
struct X {
char buf[30];
};
std::vector<X> myVec{ X() };
Given the above, the X elements in myVec are contiguous, in the allocation. sizeof(myVec[0]) == sizeof(X). But if you put pointers in the vector:
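std::vector<std::unique_ptr<X>> myVec2;
myVec2.push_back(std::make_unique<X>()); // an initializer list would try to copy the unique_ptr
This should behave more like what you want: pointers to the X objects will not become invalid when the vector resizes. Only the unique_ptr elements are moved to the new buffer; the objects they point to stay where they are.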
Another way you could do this would be to change things a little in your design. Consider an alternate to pointers entirely, where your tree contains a vector of elements, and your nodes contain vectors of integers, which are the index into that vector.
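The index-based design mentioned in the last paragraph could look roughly like the sketch below. All names (IndexedTree, kNoParent, addNode, setParent) are hypothetical, and removal of nodes is left out; the point is only that indices stay valid when the vector reallocates, whereas pointers into it would not.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sentinel index meaning "this node has no parent" (hypothetical convention).
constexpr std::size_t kNoParent = static_cast<std::size_t>(-1);

struct IndexedTree {
    struct Node {
        std::size_t parent;                  // index of the parent, or kNoParent
        std::vector<std::size_t> children;   // indices of the children
    };

    std::vector<Node> nodes;                 // all nodes live here, addressed by index

    // Appends a node and links it to its parent; returns the new node's index.
    std::size_t addNode(std::size_t parent = kNoParent) {
        const std::size_t index = nodes.size();
        nodes.push_back(Node{parent, {}});
        if (parent != kNoParent) {
            nodes[parent].children.push_back(index);
        }
        return index;
    }

    // Detaches child from its current parent (if any) and attaches it to newParent.
    void setParent(std::size_t child, std::size_t newParent) {
        const std::size_t oldParent = nodes[child].parent;
        if (oldParent != kNoParent) {
            auto& siblings = nodes[oldParent].children;
            siblings.erase(std::remove(siblings.begin(), siblings.end(), child), siblings.end());
        }
        nodes[child].parent = newParent;
        if (newParent != kNoParent) {
            nodes[newParent].children.push_back(child);
        }
    }
};
```

Moving a branch is a single setParent call on the branch's top node, since the descendants refer to each other by index and need no updates.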
vector, forward_list, ..., any std container class (other than built-in array or std::array) may be used.
Your trouble seems to be that Java classes are reference types, while C++ classes are value types. The snippet below triggers an "infinite recursion" or "use of incomplete type" error at compile time:
class node{
node mParent;//trouble
std::vector<node> children;
//...
};
The mParent member must be a reference type. In order to impose reference semantics you can make it a raw pointer:
node* mParent;
You may also use a pointer as the element type of the container, but for a C++ beginner that would most probably lead to memory leaks and weird runtime errors. We should try to stay away from manual memory management for now. So I would modify your snippet to:
class node{
private:
node* const mParent;
std::vector<node> children;
public:
//node(node const&)=delete;//do you need copies of nodes? you have to properly define this if yes.
node(node *parent):
mParent{parent}{};
void addChild(/*???*/){
children.emplace_back(this);
//...
};
//...
};

Own vector class for arduino (c++)

I also added a void Clear() method.
https://redstoner.com/forums/threads/840-minimal-class-to-replace-std-vector-in-c-for-arduino
https://forum.arduino.cc/index.php?topic=45626.0
I'm asking about this Vector class.
void push_back(Data const &x) {
if (d_capacity == d_size) resize();
d_data[d_size++] = x;
}; // Adds new value. If needed, allocates more space
How can I add an "insert" method to this Vector class? (Arduino uses C++ but does not have the standard vector.)
Vector<Sensor*> sensors;
I have another class, Sensor, and I use the vector like this:
sensors.push_back(new Sensor(1, 1, "Sensor_1", 2));
Is it possible to add values one by one to this vector class? And how do I do it?
I'd also like to ask another question.
How can I call delete (or the destructor) for this Vector "sensors" so that all the pointers are deleted? Or so that the sensors vector itself is deleted? I want to clear the data and then add new data to it.
If you want to add an item to the end of the vector, use the push_back method you've quoted above. If you want to add an item somewhere else in the vector, you'll need to add your own method which re-sizes if necessary, shifts the elements above the insert location up one place and then copies the new element into the correct slot. Something like this (untested):
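void insert_at(size_t idx, Data const &data) {
assert(idx <= d_size); // idx == d_size appends at the end
if (d_capacity == d_size) {
resize();
}
// shift the elements at and above idx up one slot, starting from the end
for (size_t i = d_size; i > idx; --i) {
d_data[i] = d_data[i - 1]; // plain copy; std::move may be unavailable on some Arduino toolchains
}
d_data[idx] = data;
++d_size;
}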
As Nacho points out, you might be better off with a linked list if you're going to do a lot of these insert operations, especially if the data you're storing is large and/or expensive to copy or move.
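To show the insert in context, and to address the question about deleting the stored pointers, here is a minimal desktop-C++ sketch. It is not the Vector class from the linked threads: the growth policy in resize(), the size() and operator[] members, and the free function deleteAll are assumptions made for illustration, and on an AVR Arduino toolchain the <cassert> and <cstddef> headers may need to be replaced with their C equivalents.

```cpp
#include <cassert>
#include <cstddef>

template <typename Data>
class Vector {
public:
    Vector() : d_size(0), d_capacity(0), d_data(nullptr) {}
    ~Vector() { delete[] d_data; }            // frees the array, NOT what stored pointers point to
    Vector(const Vector&) = delete;           // copying is left out of this sketch
    Vector& operator=(const Vector&) = delete;

    void push_back(Data const& x) { insert_at(d_size, x); }

    // Inserts x before position idx; idx == size() appends. Out-of-range idx is ignored.
    void insert_at(std::size_t idx, Data const& x) {
        if (idx > d_size) {
            return;
        }
        if (d_capacity == d_size) {
            resize();
        }
        for (std::size_t i = d_size; i > idx; --i) {
            d_data[i] = d_data[i - 1];        // shift up, starting from the end
        }
        d_data[idx] = x;
        ++d_size;
    }

    void Clear() { d_size = 0; }
    std::size_t size() const { return d_size; }
    Data& operator[](std::size_t idx) { return d_data[idx]; }

private:
    void resize() {                           // assumed policy: double the capacity
        const std::size_t newCapacity = d_capacity ? 2 * d_capacity : 4;
        Data* newData = new Data[newCapacity];
        for (std::size_t i = 0; i < d_size; ++i) {
            newData[i] = d_data[i];
        }
        delete[] d_data;
        d_data = newData;
        d_capacity = newCapacity;
    }

    std::size_t d_size;
    std::size_t d_capacity;
    Data* d_data;
};

// For a Vector of owning raw pointers: delete every element, then empty the vector.
template <typename T>
void deleteAll(Vector<T*>& v) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        delete v[i];
    }
    v.Clear();
}
```

With this, calling deleteAll(sensors) would destroy every Sensor created with new and leave the vector empty and ready for new data. Calling Clear() alone would leak the objects, because the vector only stores their addresses.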

Writing C++ API - how to keep external references to API internal objects?

So I'm writing an API in C++ to be used in another GUI application I'll be writing. The API will allow the user to create instances of "MyObject" and modify the properties of that object, but the object itself will not be exposed to the client, only an ID to that object. So for instance:
Object_ID identifier = myApiCreateObject();
myApiModifyProperty(identifier, "PROPERTY_NAME", "value");
So the identifier acts as an external handler to a specific MyObject instance.
As of right now the Object_ID is defined as follows:
typedef int Object_ID;
Currently all MyObject instances are stored in a std::vector within my API. The Object_ID is simply the index in the vector at which the desired instance lives.
The problem with this approach is that I don't know how to handle deleting instances of MyObject from the vector. For instance, let's say I have 10 instances of MyObject created and I want to delete the instance at index 5, I would want to do something like the following:
myApiDeleteObject(handlerForIndex5);
By doing this though, internally my API would remove that object from the std::vector and then would have to shift over all the objects at indices > 5. This would cause my external handlers to no longer reference the correct object.
So just using the index of the array by itself is not sufficient, but I don't know of a better alternative without having to expose the MyObject class to the client.
EDIT
Here's an updated example highlighting the issue at hand:
Internally the API performs certain algorithms on the list of objects, some of these algorithms require sorting the vector as a step.
So my GUI would do something like :
myApiBeginCalculations();
and then internally the API would be doing something like this:
myApiBeginCalculations()
{
//Start algorithm
.......
Sort(vector);
//Continue with algorithm
}
Then let's say after that algorithm is complete, the user wants to modify a given MyObject instance and start again:
myApiBeginCalculations();
myApiModifyProperty(myHandler, "PROPERTY", "VALUE");
myApiBeginCalculations();
myApiDeleteObject(myHandler);
myApiBeginCalculations();
Internally myApi will be doing a bunch of things to the MyObject instances and I need a reliable way to keep track of individual instances on the client even as they get shuffled around.
You can use std::map in place of std::vector. That way you can look objects up quickly and remove them whenever you need to, without affecting the keys of the other objects.
std::map<int, Object> Object_directory;
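A minimal sketch of how such a map could back the API, assuming IDs are handed out from a counter that is never reused. The names ObjectRegistry, MyObject::properties and next_id_ are hypothetical and only illustrate the idea.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

typedef int Object_ID;

struct MyObject {                                       // hypothetical stand-in for the real class
    std::map<std::string, std::string> properties;
};

class ObjectRegistry {
public:
    // Creates a new object and returns its ID; IDs are never reused.
    Object_ID create() {
        const Object_ID id = next_id_++;
        objects_[id];                                   // default-constructs the object in the map
        return id;
    }

    // Returns false if the ID does not refer to a live object.
    bool modify(Object_ID id, const std::string& name, const std::string& value) {
        auto it = objects_.find(id);
        if (it == objects_.end()) {
            return false;
        }
        it->second.properties[name] = value;
        return true;
    }

    // Removing one object leaves every other ID valid.
    bool destroy(Object_ID id) { return objects_.erase(id) == 1; }

    std::size_t size() const { return objects_.size(); }

private:
    std::map<Object_ID, MyObject> objects_;
    Object_ID next_id_ = 1;
};
```

If an algorithm needs the objects in sorted order, it can sort a temporary vector of pointers or IDs instead of the map itself, so client handles are unaffected.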
You need to use an ID based on something that is both unique for each object and which remains constant for each object. Clearly an index into a vector you're continually rearranging does not qualify.
You haven't described the properties of the objects, so I can't say whether there's something already suitable for this use, but if not then you can add something. You can assign an ID to each object as you create it, or you could allocate the objects on the heap so that their addresses remain consistent as you, for example, sort a vector<unique_ptr<MyObject>>.
You'll have to consider each operation you need to perform and figure out the necessary performance. For example a linear search through the vector in order to find an object with a matching ID may be too slow for some purpose. In that case you'll have to figure out how to avoid that linear search, perhaps by keeping a map on the side or something, at the cost of having to keep the map updated during other operations.
I would suggest not generating an ID number at all. Simply use a real pointer to the actual Object instance instead. To hide it from the client, you can use void* or uintptr_t, and just have your API functions type-cast that value to an Object* pointer when needed. You can still keep track of the Object instances in a std::vector so you can perform your algorithms on the objects, but the order of the std::vector will not be important to clients, and deleting any given Object will not invalidate other object IDs.
typedef uintptr_t Object_ID;
typedef std::vector<Object*> ObjectVector;
typedef ObjectVector::iterator ObjectVectorIter;
ObjectVector objVec;
Object_ID myApiCreateObject()
{
try
{
std::auto_ptr<Object> obj(new Object);
objVec.push_back(obj.get());
return reinterpret_cast<Object_ID>(obj.release());
}
catch (const std::exception&)
{
return 0;
}
}
ObjectVectorIter myApiFindObject(Object_ID identifier)
{
Object *obj = reinterpret_cast<Object*>(identifier);
return std::find(objVec.begin(), objVec.end(), obj);
}
void myApiModifyProperty(Object_ID identifier, const char* propName, const char* propValue)
{
ObjectVectorIter iter = myApiFindObject(identifier);
if (iter != objVec.end())
(*iter)->property[propName] = propValue; // *iter is an Object*, so dereference twice
}
void myApiDeleteObject(Object_ID identifier)
{
ObjectVectorIter iter = myApiFindObject(identifier);
if (iter != objVec.end())
{
Object* obj = *iter;
objVec.erase(iter);
delete obj;
}
}
Or, if you are using C++11:
typedef uintptr_t Object_ID;
typedef std::shared_ptr<Object> ObjectPtr;
typedef std::vector<ObjectPtr> ObjectVector;
typedef ObjectVector::iterator ObjectVectorIter;
ObjectVector objVec;
Object_ID myApiCreateObject()
{
try
{
ObjectPtr obj = std::make_shared<Object>();
objVec.push_back(obj);
return reinterpret_cast<Object_ID>(obj.get());
}
catch (const std::exception&)
{
return 0;
}
}
ObjectVectorIter myApiFindObject(Object_ID identifier)
{
Object *obj = reinterpret_cast<Object*>(identifier);
return std::find_if(objVec.begin(), objVec.end(), [obj](const ObjectPtr &p){ return p.get() == obj; });
}
void myApiModifyProperty(Object_ID identifier, const char* propName, const char* propValue)
{
ObjectVectorIter iter = myApiFindObject(identifier);
if (iter != objVec.end())
(*iter)->property[propName] = propValue;
}
void myApiDeleteObject(Object_ID identifier)
{
ObjectVectorIter iter = myApiFindObject(identifier);
if (iter != objVec.end())
objVec.erase(iter);
}

c++ : alternative for Vector of references to avoid copying large data

I have spent some time looking for answers but didn't find anything that was satisfactory.
Just interested in how some more seasoned C++ people solve this kind of problem as now I am doing a little more production related coding than prototyping.
Let's say you have a class that has an unordered_map (hash map) that holds a lot of data, say 500 MB. You want to write an accessor that returns some subset of that data in an efficient manner.
Take the following, where BigData is some class that stores a moderate amount of data.
class A
{
private:
unordered_map<string, BigData> m_map; // lots of data
public:
vector<BigData> get10BestItems()
{
vector<BigData> results;
for ( ........ // iterate over m_map and add 10 best items to results
// ...
return results;
}
};
The accessor get10BestItems is not very efficient in this code because it first copies the items to the results vector, then the results vector is copied when the function is returned (copying from the function stack).
You can't have a vector of references in C++ for various reasons, which would be the obvious answer:
vector<BigData&> results; // vector can't contain references.
You could create the results vector on the heap and pass a reference to that:
vector<BigData>& get10BestItems() // returns a reference to the vector
{
vector<BigData>& results = *new vector<BigData>; // generate on heap
for ( ........ // iterate over m_map and add 10 best items to results
// ...
return results; // can return the reference
}
But then you are going to run into memory leak issues if you are not careful. It is also slow (heap memory) and still copies data from the map to the vector.
So we can look back at c-style coding and just use pointers:
vector<BigData*> get10BestItems() // returns a vector of pointers
{
vector<BigData*> results ; // vectors of pointers
for ( ........ // iterate over m_map and add 10 best items to results
// ...
return results;
}
But most sources say to not use pointers unless absolutely necessary. There are options to use smart_pointers and the boost ptr_vector but I rather try to avoid these if possible.
I do know that the map is going to be static, so I am not too worried about bad pointers. The one issue is that the code has to be different to handle pointers. Stylistically this is not pleasant:
const BigData& getTheBestItem() // returns a const reference
{
string bestID;
for ( ........ // iterate over m_map, find bestID
// ...
return m_map[bestID]; // return a reference to the best item
}
vector<BigData*> get10BestItems() // returns a vector of pointers
{
vector<BigData*> results ; // vectors of pointers
for_each ........ // iterate over m_map and add 10 best items to results
// ...
return results;
}
E.g., if you want a single item then it is easy to return a reference.
A final option is to simply make the hash map public and return a vector of keys (in this case strings):
class A
{
public:
unordered_map<string, BigData> m_map; // lots of data
vector<string> get10BestItemKeys()
{
vector<string> results;
for (........ // iterate over m_map and add 10 best KEYS to results
// ...
return results;
}
};
A aTest;
... // load data to map
vector <string> best10 = aTest.get10BestItemKeys();
for ( .... // iterate over all KEYs in best10
{
aTest.m_map.find(KEY); // do something with item.
// ...
}
What is the best solution? Speed is important but I want ease of development and safe programming practices.
I would just go with a vector of pointers if the map is constant. You can always return const pointers if you want to avoid the data being changed.
References are great for when they work but there's a reason we still have pointers (for me this would fall under the category of being 'necessary').
I would do something similar to the following:
class A
{
private:
unordered_map<string, BigData> m_map; // lots of data
vector<BigData*> best10;
public:
A()
: best10(10)
{
// Other constructor stuff
}
const vector<BigData*>& get10BestItems()
{
// Set best10[0] through best10[9] with the pointers to the best 10
return best10;
}
};
Note a few things:
The vector isn't being reallocated each time and is being returned as a constant reference, so nothing is allocated or copied when you call get10BestItems.
Pointers are just fine in this situation. The things you read about avoiding pointers were probably in relation to heap allocations, in which case std::unique_ptr or std::shared_ptr are now preferred.
This sounds like a job for boost::ref to me (the wrapper type it produces is boost::reference_wrapper). Just change your original code slightly:
typedef std::vector<boost::reference_wrapper<BigData> > BestItems;
BestItems get10BestItems()
{
BestItems results;
for ( ........ // iterate over m_map and add 10 best items to results
// ...
return results;
}
Now you're notionally only returning a reference to each item within your return vector making it small and cheap to copy (if the compiler isn't able to optimize away the return copy completely).
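With C++11 the same idea can be sketched using std::reference_wrapper from the standard library instead of Boost. In the example below the score member, the free function getBestItems and the "higher score is better" ranking are assumptions made for illustration; the original question does not say how items are ranked.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

struct BigData {
    double score;   // hypothetical ranking criterion; the real class would hold much more
};

typedef std::vector<std::reference_wrapper<const BigData>> BestItems;

// Returns references to the n highest-scoring items; no BigData object is copied.
BestItems getBestItems(const std::unordered_map<std::string, BigData>& map, std::size_t n) {
    BestItems refs;
    refs.reserve(map.size());
    for (const auto& entry : map) {
        refs.push_back(std::cref(entry.second));
    }
    n = std::min(n, refs.size());
    // Only the first n positions need to end up sorted.
    std::partial_sort(refs.begin(), refs.begin() + static_cast<std::ptrdiff_t>(n), refs.end(),
                      [](const BigData& a, const BigData& b) { return a.score > b.score; });
    // erase rather than resize, because reference_wrapper has no default constructor.
    refs.erase(refs.begin() + static_cast<std::ptrdiff_t>(n), refs.end());
    return refs;
}
```

The returned references stay valid only as long as the referenced elements remain in the map. Rehashing an unordered_map does not invalidate references to its elements, but erasing an element does.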
I usually use boost::range, and I have found it invaluable in many situations, especially the one you describe.
You can keep the range object and iterate over it, etc.
But I should mention that I don't know what happens if you add or remove an object between when you get the range and when you use it, so you may want to check that before using it.