Is there an stl container for this hierarchical model data? - c++

For a platform-independent model layer, I have hierarchical data (strings, actually) that look like this:
Item A
    SubItem A
    SubItem B
    SubItem C
        SubSubItem A
        SubSubItem B
    SubItem D
Item B
Item C
Now, within each "level" (Item, SubItem, SubSubItem, etc.) the items need to be sorted alphabetically.
Seems a simple solution would be to create a small class with a sorted std::vector or std::multimap to track its children, and a pointer to its parent (plus one root item). I would generally need to iterate through each item's children in a forward direction.
After construction/sorting, I do not need to add or delete items. Generally small numbers of items (hundreds).
This is for model organization of the backing data of an outline-style control.
Rolling a simple class would be easy, but this is such a common pattern — isn't there already a ready-made STL container with this behavior?

Nothing in STL itself, but you might find this useful:
tree.hh: an STL-like C++ tree class
Its API follows STL containers exactly, and it should do what you're looking for.
I believe their example is exactly what you are asking about (a tree with strings), in fact.

A simple solution:
Your keys are std::vector<GUID>, where the GUID is some type (maybe a GUID, or a pointer, or a string) that uniquely identifies each element. Children of an element simply have that element's std::vector<GUID> as a "prefix".
So long as your GUIDs are sortable via operator<, lexicographic sorting on the std::vector will get things in the order you requested.
A map<std::vector<GUID>, Value> could be your container, or a std::vector< std::pair< std::vector<GUID>, Value > > that you sort by .first manually.
If your GUID type can have a "last element", you can find every child of {x,y,z} as the range from lower_bound of {x,y,z} to upper_bound of {x,y,z,last_guid}. Being able to define such a "last element" is one advantage of not using a naked pointer as the GUID.
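A minimal sketch of the prefix-key idea, assuming strings as the GUID type. The names Path, Outline and descendants are made up for illustration. Instead of a "last element" sentinel, this variant starts at upper_bound of the prefix and walks forward until a key no longer begins with that prefix, which works because all keys sharing a prefix are contiguous under lexicographic ordering.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical aliases: a key is the full path from the top level down.
using Path = std::vector<std::string>;
using Outline = std::map<Path, int>;  // the mapped type is arbitrary here

// Collect every key that lies strictly below `prefix` in the hierarchy.
inline std::vector<Path> descendants(const Outline& outline, const Path& prefix) {
    std::vector<Path> result;
    // Keys extending `prefix` sort directly after `prefix` itself.
    for (auto it = outline.upper_bound(prefix); it != outline.end(); ++it) {
        const Path& key = it->first;
        const bool extendsPrefix =
            key.size() > prefix.size() &&
            std::equal(prefix.begin(), prefix.end(), key.begin());
        if (!extendsPrefix) break;  // walked past the end of the subtree
        result.push_back(key);
    }
    return result;
}
```

Iterating the whole map visits the items depth-first with each level in alphabetical order, which is the order an outline control would display.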

No. Don't mean to be curt, but that is the answer; see e.g. Josuttis, or the standard. You'll have to create a class with parent/child pointers along the lines you suggested and use a vector or other standard container of those.
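For what it's worth, the hand-rolled class can be quite small. Here is a rough sketch, assuming C++14 and that sorting happens once after the tree is built. OutlineNode, addChild, sortChildren and flatten are illustrative names, not anything from the standard library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type: owns its children, keeps a non-owning parent pointer.
struct OutlineNode {
    std::string label;
    OutlineNode* parent = nullptr;                       // null for the root
    std::vector<std::unique_ptr<OutlineNode>> children;  // ordered by sortChildren()

    explicit OutlineNode(std::string text) : label(std::move(text)) {}

    // Append a child and return it so that subtrees can be built inline.
    OutlineNode& addChild(std::string text) {
        children.push_back(std::make_unique<OutlineNode>(std::move(text)));
        children.back()->parent = this;
        return *children.back();
    }

    // Sort this level and every level below it alphabetically.
    void sortChildren() {
        std::sort(children.begin(), children.end(),
                  [](const auto& a, const auto& b) { return a->label < b->label; });
        for (auto& child : children) child->sortChildren();
    }
};

// Depth-first walk that collects labels, indented two spaces per level.
inline void flatten(const OutlineNode& node, int depth, std::vector<std::string>& out) {
    assert(depth >= 0);
    out.push_back(std::string(static_cast<std::size_t>(depth) * 2, ' ') + node.label);
    for (const auto& child : node.children) flatten(*child, depth + 1, out);
}
```

Because each node lives behind a unique_ptr, node addresses (and therefore the parent pointers) stay valid when the child vectors are sorted or reallocated.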

The answer to your question is no, there is no tree in the STL. The patterns you suggested are fine. Also see this question.

Related

How can I use the Iterator design pattern to provide multiple iterators for a wrapper container class in C++?

I am really a noob in C++, so please don't mind if my question shows lack of basic knowledge.
I have a class User that describes a user in my system. It contains simple fields, like string name, string email, int age, etc.
I also have a wrapper class Group that uses an std::list in order to store a collection of Users. What I want to do is to use the Iterator design pattern in order to provide two different iterators for the Group class. By using one of the iterators, I want to be able to go through the list sorted by the User's name, and by using the other one I want to do the same thing, except that it should be sorted by the User's age.
I read this article (https://www.robertlarsononline.com/2017/04/24/iterator-pattern-using-cplusplus/) on the Iterator design pattern but I'm not sure on how I can adapt the shown code to my case. I am guessing that my Group class needs to have two CreateIterator() methods (i.e: CreateAgeIterator() and CreateNameIterator()), each one returning a different specialization of the Iterator class. The problem is that I don't know where exactly I should put the logic of sorting the list according to a specific criterion.
This is just a basic project that I am doing to study a little bit of the language as well as the Iterator design pattern itself. I am not concerned with being STL compliant or anything, I just need a simple implementation of the concept.
I want to be able to go through the list sorted by the User's name, and by using the other one I want to do the same thing, except that it should be sorted by the User's age.
In order to iterate a list in a sorted order, the list must be sorted according to that order. But a list can only have one order.
One solution for two orderings is to sort the list using one order, and have another data structure, which contains pointers to the list elements, but sorted using the other ordering.
Note that the iterators of this other container don't point to the elements of the list directly, but instead to the pointers. As such, using the iterators of the pointer container itself is not the same as using iterators to the list that contains the elements.
The problem is that I don't know where exactly I should put the logic of sorting the list according to a specific criterion.
Enforce the ordering whenever elements are inserted or removed.
Note that the elements of the list should be const so that you cannot break the ordering by modifying the state of the objects.
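A rough sketch of the "second container of pointers" idea, under the assumption that it is acceptable to build the sorted views on demand rather than maintain them on every insert. The member names add, byName, byAge and sortedView are invented for this example. The views hold pointers to const User, so the ordering cannot be broken through them.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <string>
#include <utility>
#include <vector>

struct User {
    std::string name;
    int age;
};

class Group {
public:
    void add(User user) { users_.push_back(std::move(user)); }

    // Each call returns a freshly sorted vector of pointers into the list.
    std::vector<const User*> byName() const {
        return sortedView([](const User* a, const User* b) { return a->name < b->name; });
    }
    std::vector<const User*> byAge() const {
        return sortedView([](const User* a, const User* b) { return a->age < b->age; });
    }

private:
    template <typename Compare>
    std::vector<const User*> sortedView(Compare less) const {
        std::vector<const User*> view;
        view.reserve(users_.size());
        for (const User& user : users_) view.push_back(&user);
        std::stable_sort(view.begin(), view.end(), less);
        return view;
    }

    std::list<User> users_;  // list elements never move, so the pointers stay valid
};
```

Iterating over the result of byName() or byAge() then plays the role of the two iterators. A returned view becomes stale once users are added or removed, so it would need to be rebuilt after such changes.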

C++ multi-index map implementation

I'm implementing a multi-index map in C++11, which I want to be optimized for specific features. The problem I'm currently trying to solve is to not store key elements more than once. But let me explain.
The problem arose from sorting histograms to overlay them in different combinations. The histograms had names, which could be split into tokens (properties).
Here are the features I want my property map to have:
Be able to loop over properties in any order;
Be able to return container with unique values for each property;
Accumulate properties' values in the order they arrive, but to be able to sort properties using a custom comparison operator after the map is filled;
I have a working implementation in C++11 using std::unordered_map with std::tuple as key_type. I'm accumulating property values as they arrive into a tuple of forward_lists. The intended use is to iterate over the lists to compose keys.
The optimization I would like to introduce is to store the properties' values only in the lists, and not in the tuples used as keys in the map. I'd like to keep the ability to have functions returning const references to lists of property values, instead of lists of some wrappers.
I know that boost::multi_index has similar functionality, but I don't need the overhead of sorting as the keys arrive. I'd like to have new property values stored sequentially, and only be sortable after the fact. I've also looked at boost::flyweight, but in the simplest approach, the lists will then be of flyweight<T> instead of T, and I'd like to not do that. (If that IS the best solution, I could definitely live with it.)
I know that lists are stable, i.e. once an element is created, its pointer and iterator remain valid, even after invoking list::sort(). Knowing that, can something be done to the map, to eliminate redundant copies of tuple elements? Could a custom map allocator help here?
Thanks for suggestions.
Have your map be from tuples of iterators to your prop containers.
Write a hash that dereferences the iterators and combines the results.
Replace the forward list prop containers with sets that first order on hash, then contents.
Do lookup by first finding in set, then doing lookup in hash.
If you need a different order for props, have another container of set iterators.

Design Pattern for Data Structure

This question has been answered, so what follows below is an explanation of what I wanted to achieve.
I wanted to create a tabular data structure designed to allow efficient access to any row entry through a primary column that could possibly be hashed. I thought that the best way to go about this would be to maintain a vector of doubly-linked lists, each of which would represent one column, and a map that would contain mappings of primary column entry hashes to nodes. Now, the first mistake I made is in thinking that I would need to create my own implementation of a doubly-linked list in order to be able to store pointers to nodes, when in fact the standard states that iterators to std::list do not get invalidated as a result of insertion or splicing (see larsmans's answer). Here's some pseudocode to illustrate what I wanted to do previously. Assume the existence of a typename T representing the entry type and the existence of a dlist and node class, as described previously.
typedef dlist<T> column_type;
typedef vector<T> row_type;
typedef ptr_unordered_map<int32_t, row_type> hash_type;
shared_ptr<ptr_vector<column_type> > columns;
shared_ptr<hash_type> hashes;
Now, after reading larsmans's answer, I learned that I wouldn't need any of this since Boost.MultiIndex fulfills all of my needs as it is. Even if I did, Boost.Intrusive offers more efficient data structures to accomplish what I describe.
Thanks to all who took interest in the question or offered help! If you have any more questions, add another comment and I'll do my best to clarify the question further.
front() should return a reference to a node containing the value_type
Sounds like you're thinking of begin instead of front, in STL/Boost terms, except that begin methods usually return iterators instead of references.
How would I be able to use a map of key hashes to std::list::iterator types and allow for addition of rows without having the entries in the map get outdated
Just do; "lists have the important property that insertion and splicing do not invalidate iterators to list elements, and that even removal invalidates only the iterators that point to the elements that are removed" (STL docs).
If you wanted, you could maintain a single std::list for the entire table and a vector of iterators into it to represent the starting points of rows.
Besides, have you looked at Boost.Intrusive and Boost.MultiIndex? And did you know that an std::map (red-black tree) of hashes is a very suboptimal way of representing a hash table?
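To illustrate the point about list iterators staying valid, here is a small sketch that keeps rows in a std::list and indexes them through a std::unordered_map of list iterators. The Table class, its member names, and the choice of the first cell as the primary key are all assumptions made for the example.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Row = std::vector<std::string>;

class Table {
public:
    // Append a row, indexed by its first cell. Returns false on a duplicate key.
    bool append(Row row) {
        assert(!row.empty());
        const std::string key = row.front();
        if (index_.count(key) != 0) return false;
        rows_.push_back(std::move(row));
        index_.emplace(key, std::prev(rows_.end()));
        return true;
    }

    // Look up a row by key; returns nullptr if the key is unknown.
    const Row* find(const std::string& key) const {
        auto it = index_.find(key);
        return it == index_.end() ? nullptr : &*it->second;
    }

    // Remove a row; only the iterator of the erased element is invalidated.
    bool erase(const std::string& key) {
        auto it = index_.find(key);
        if (it == index_.end()) return false;
        rows_.erase(it->second);
        index_.erase(it);
        return true;
    }

private:
    std::list<Row> rows_;
    std::unordered_map<std::string, std::list<Row>::iterator> index_;
};
```

Adding more rows never touches the iterators already stored in the index, so the map entries do not go out of date.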

Which STL container for ordered data with key-based access?

Let's say I have a collection of Person objects, each of which looks like this:
class Person
{
public:
    std::string Name;
    std::string UniqueID;
};
Now, the objects must be stored in a container which allows me to order them so that, given item X, I can easily locate item X+1 and X-1.
However, I also need fast access based on the UniqueID, as the collection will be large and a linear search won't cut it.
My current 'solution' is to use a std::list in conjunction with a std::map. The list holds the Persons (for ordered access) and the map is used to map UniqueID to a reference to the list item. Updating the 'container' typically involves updating both map and list.
It works, but I feel there should be a smarter way of doing it, maybe boost::bimap. Suggestions?
EDIT: There's some confusion about my requirement for "ordering". To explain, the objects are streamed in sequentially from a file, and the 'order' of items in the container should match the file order. The order is unrelated to the IDs.
boost::bimap is the most obvious choice. bimap is based on boost::multi_index, but bimap has a simplified syntax. Personally I would prefer boost::multi_index over boost::bimap because it makes it easy to add more indices to the Person structure in the future.
There is no Standard Library container that does what you want - so you will have to use two containers or the Boost solution. If using two containers, I would normally prefer a vector or a deque over a list, in almost all circumstances.
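A possible shape for the two-container approach with a vector, sketched under the assumption that items are only ever appended (as when streaming from a file), so stored positions never shift. PersonStore and its member names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct Person {
    std::string name;
    std::string uniqueId;
};

class PersonStore {
public:
    // Append in file order. Returns false if the ID is already present.
    bool append(Person person) {
        if (indexById_.count(person.uniqueId) != 0) return false;
        indexById_.emplace(person.uniqueId, people_.size());
        people_.push_back(std::move(person));
        return true;
    }

    // Each lookup returns nullptr when there is no such element.
    // Returned pointers are only valid until the next append.
    const Person* find(const std::string& id) const { return at(id, 0); }
    const Person* next(const std::string& id) const { return at(id, +1); }
    const Person* previous(const std::string& id) const { return at(id, -1); }

private:
    const Person* at(const std::string& id, int offset) const {
        auto it = indexById_.find(id);
        if (it == indexById_.end()) return nullptr;
        const long long pos = static_cast<long long>(it->second) + offset;
        if (pos < 0 || pos >= static_cast<long long>(people_.size())) return nullptr;
        return &people_[static_cast<std::size_t>(pos)];
    }

    std::vector<Person> people_;                              // file order
    std::unordered_map<std::string, std::size_t> indexById_;  // ID -> position
};
```

If removal from the middle were needed, the stored positions would have to be updated, and at that point a list with stored iterators or Boost.MultiIndex may be the better fit.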
Why not use two maps, one with Person as the key and another with UniqueId as the key? That does require updating both of them.
You can create a callback function that updates both maps whenever there is a change.

Is there a data structure that doesn't allow duplicates and also maintains order of entry?

Duplicate: Choosing a STL container with uniqueness and which keeps insertion ordering
I'm looking for a data structure that acts like a set in that it doesn't allow duplicates to be inserted, but also knows the order in which the items were inserted. It would basically be a combination of a set and list/vector.
I would just use a list/vector and check for duplicates myself, but we need that duplicate verification to be fast as the size of the structure can get quite large.
Take a look at Boost.MultiIndex. You may have to write a wrapper over this.
A Boost.Bimap with the insertion order as an index should work (e.g. boost::bimap < size_t, Foo > ). If you are removing objects from the data structure, you will need to track the next insertion order value separately.
Writing your own class that wraps a vector and a set would seem the obvious solution - there is no C++ standard library container that does what you want.
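Such a wrapper might look roughly like this. It is only a sketch: InsertionOrderedSet is a made-up name, it uses std::unordered_set for the duplicate check (which assumes T is hashable, std::set would work for ordered types), and it stores every value twice.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

template <typename T>
class InsertionOrderedSet {
public:
    // Returns true if the value was new and has been appended.
    bool insert(const T& value) {
        if (!seen_.insert(value).second) return false;  // duplicate, ignore
        order_.push_back(value);
        return true;
    }

    bool contains(const T& value) const { return seen_.count(value) != 0; }

    // Elements in the order they were first inserted.
    const std::vector<T>& items() const { return order_; }

private:
    std::unordered_set<T> seen_;  // fast duplicate check
    std::vector<T> order_;        // insertion order
};
```

Removal is left out on purpose: erasing from the middle of the vector is linear, so a version that needs deletion would probably pair a hash map with a std::list instead.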
Java has this in the form of an ordered set. I don't think C++ has this, but it is not that difficult to implement yourself. What the Sun guys did with the Java class was to extend the hash table such that each item was simultaneously inserted into a hash table and kept in a doubly linked list. There is very little overhead in this, especially if you preallocate the items that are used to construct the linked list from.
If I were you, I would write a class that either uses a private vector to store the items or implements a hash table itself. When an item is to be inserted into the set, check whether it is already in the hash table and optionally replace the existing item. Then find the old item in the hash table, update the list to point to the new element and you are done.
To insert a new element you do the same, except you have to use a new element in the list - you can't reuse the old ones.
To delete an item, you reorder the list to point around it, and free the list element.
Note that it should be possible for you to get the part of the linked list where the element you are interested in is directly from the element so that you don't have to walk the chain each time you have to move or change an element.
If you anticipate having a lot of these items changed during the program run, you might want to keep a list of the list items, such that you can merely take the head of this list, rather than allocating memory each time you have to add a new element.
You might want to look at the dancing links algorithm.
I'd just use two data structures, one for order and one for identity. (One could point into the other if you store values, depending on which operation you want the fastest)
Sounds like a job for an OrderedDictionary.
Duplicate verification that's fast seems to be the critical part here. I'd use some type of a map/dictionary maybe, and keep track of the insertion order yourself as the actual data. So the key is the "data" you're shoving in (which is then hashed, and you don't allow duplicate keys), and put in the current size of the map as the "data". Of course this only works if you don't have any deletions. If you need that, just have an external variable you increment on every insertion, and the relative order will tell you when things were inserted.
Not necessarily pretty, but not that hard to implement either.
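A sketch of the counter variant described above, with strings as the element type. StampedSet and its members are illustrative names. Each key maps to the value of a counter at insertion time, and the insertion order is recovered by sorting on that stamp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

class StampedSet {
public:
    // Returns false (and changes nothing) if the value is already present.
    bool insert(const std::string& value) {
        if (!stamps_.emplace(value, next_).second) return false;
        ++next_;  // the counter only advances, so deletions do not disturb order
        return true;
    }

    bool erase(const std::string& value) { return stamps_.erase(value) != 0; }

    // Rebuild the insertion order by sorting on the stored stamps.
    std::vector<std::string> inInsertionOrder() const {
        std::vector<std::pair<std::uint64_t, std::string>> tagged;
        tagged.reserve(stamps_.size());
        for (const auto& entry : stamps_) tagged.emplace_back(entry.second, entry.first);
        std::sort(tagged.begin(), tagged.end());
        std::vector<std::string> result;
        result.reserve(tagged.size());
        for (auto& item : tagged) result.push_back(std::move(item.second));
        return result;
    }

private:
    std::unordered_map<std::string, std::uint64_t> stamps_;
    std::uint64_t next_ = 0;
};
```

The trade-off is that duplicate checks are cheap while listing the elements in order costs a sort each time, so this suits workloads that insert often and enumerate rarely.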
Assuming that you're talking ANSI C++ here, I'd either write my own or use composition and delegation to wrap a map for data storage and a vector of the keys for order of insertion. Depending on the characteristics of the data, you might be able to use the insertion index as your map key and avoid using the vector.