The C++ STL apparently is missing an ordered tree data structure. See here. Boost is also missing an ordered tree, but it does have an "un"ordered one, Property Tree where the data is ordered by insertion. I want the order to be irrespective of memory.
The boost page on Property Trees says that this is conceptually the boost::ptree structure.
struct ptree
{
data_type data; // data associated with the node
list< pair<key_type, ptree> > children; // ordered list of named children by insertion
};
I want to extend boost to keep track of order.
Is this the correct way?
class ordered_ptree : public boost::property_tree::ptree {
public:
ordered_ptree(int id) : _id{id}{};
protected:
int _id;
};
(From the comments in your question, I understand you want something like Python's OrderedDict but taking into account keys' relative order.)
Since none of the standard library's (or boost's) containers are exactly what you want, you might want to extend std::map (especially if you don't need all of the interface).
Say you start with
template<
typename Key,
typename Value,
class Compare=std::less<Key>,
class Alloc=std::allocator<std::pair<const Key, Value> > >
class ordered_map
{
// This needs to be filled.
};
Now inside, you can hold an insertion counter:
std::size_t m_ins_count;
which is initialized to 0 and incremented at each insert.
Internally, your new keys will be std::pairs of the original key and the insertion count. Standard properties of binary search trees imply that nodes with keys differing only by the second pair item (which is the insertion count), will be consecutive in an in-order walk, which means that
you retain order of different keys
you retain order of insertion within a key
the operations are logarithmic time
traversing same-key items is (amortized) linear time
So, internally you'd have something like
typedef
std::map<
std::pair<Key, std::size_t>,
Value,
lex_compare<Compare>,
std::allocator<std::pair<const std::pair<Key, std::size_t>, Value> > >
internal_map_t;
(where lex_compare<Compare> compares first by the given functor, then by the insertion index).
Now you can choose a (minimal) interface and implement it by translating between plain keys in the "outer world" and pairs of key + insertion index in the "inner world" of the tree.
If you plan on providing an iterator interface as well, you might find the boost iterator library useful, as you simply want to modify std::map's iterators.
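A minimal sketch of the idea above, assuming only two operations are needed. The member names insert and values_for, and the exact shape of lex_compare, are made up for illustration; the allocator parameter is left out to keep it short.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical comparator: user's ordering first, insertion index second.
template <class Compare>
struct lex_compare {
    template <class K>
    bool operator()(const std::pair<K, std::size_t>& a,
                    const std::pair<K, std::size_t>& b) const {
        Compare cmp;
        if (cmp(a.first, b.first)) return true;
        if (cmp(b.first, a.first)) return false;
        return a.second < b.second;  // same key: earlier insertion first
    }
};

template <class Key, class Value, class Compare = std::less<Key> >
class ordered_map {
    typedef std::map<std::pair<Key, std::size_t>, Value,
                     lex_compare<Compare> > internal_map_t;

public:
    ordered_map() : m_ins_count(0) {}

    // Every insert gets a fresh counter value, so repeated keys are kept.
    void insert(const Key& key, const Value& value) {
        m_map.insert(std::make_pair(std::make_pair(key, m_ins_count++), value));
    }

    // Values stored under `key`, in insertion order.
    std::vector<Value> values_for(const Key& key) const {
        std::vector<Value> out;
        Compare cmp;
        // (key, 0) is the smallest internal key that can belong to `key`.
        typename internal_map_t::const_iterator it =
            m_map.lower_bound(std::make_pair(key, std::size_t(0)));
        for (; it != m_map.end() && !cmp(key, it->first.first); ++it)
            out.push_back(it->second);
        return out;
    }

    std::size_t size() const { return m_map.size(); }

private:
    internal_map_t m_map;
    std::size_t m_ins_count;
};
```

Looking up a key is a lower_bound on (key, 0) followed by a walk over the consecutive entries that share the key, which is where the logarithmic lookup plus linear same-key traversal comes from.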
Consider a hierarchical tree structure, where an item may have sibling items (at the same level in the hierarchy) and may also have child items (one level down in the hierarchy).
Let's say the structure can be defined like:
// an item of a hierarchical data structure
struct Item {
int data; // keep it an int, rather than <T>, for simplicity
vector<Item> children;
};
I wanted to be able to use algorithms over this structure, like the algorithms for a std::map, std::vector, etc. So, I created a few algorithms, like:
template <class Function>
Function for_each_children_of_item( Item, Function f ); // deep (recursive) traversal
template <class Function>
Function for_each_direct_children_of_item( Item, Function f ); // shallow (1st level) traversal
template <class Function>
Function for_each_parent_of_item( Item, Function f ); // going up to the root item
One thing that troubled me is that there are 3 for_each() functions for the same structure. But they give a good description of how they iterate, so I decided to live with it.
Then, soon, the need for more algorithms emerged (like find_if, count_if, any_of, etc), which made me feel I'm not on the right track, design-wise.
One solution I can think of, that would reduce the workload, would be to simply write:
vector<Item> get_all_children_of_item( Item ); // recursive
vector<Item> get_all_direct_children_of_item( Item ); // 1st level items
vector<Item> get_all_parents_of_item( Item ); // up to the root item
and then I could use all the STL algorithms.
I am a bit wary of this solution, because it involves copying.
I cannot think of a way to implement an iterator, as there is no obvious end() iterator in the recursive version of the traversal.
Can anybody present a typical / idiomatic way to deal with such non-linear data structures ?
Can/should iterators be created for such a structure? how?
Use iterators.
I cannot think of a way to implement an iterator, as there is no obvious end() iterator in the recursive version of the traversal.
end() can be any designated special value for your iterator class, as long as your increment operator produces it when stepping past the last element. And/or overload operator== and operator!= for your iterator.
If you want to be really robust, implement an iterator mode for each of the XPath axes.
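To make the end() point concrete, here is a rough sketch of a pre-order iterator over all descendants of an Item. The class name descendant_iterator and the helper count_even_descendants are invented for this example; the iterator keeps an explicit stack of pending nodes, and the empty stack is the designated end value.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

struct Item {
    int data;
    std::vector<Item> children;
};

// Hypothetical forward iterator: visits every descendant in pre-order.
class descendant_iterator {
public:
    typedef std::forward_iterator_tag iterator_category;
    typedef Item value_type;
    typedef std::ptrdiff_t difference_type;
    typedef const Item* pointer;
    typedef const Item& reference;

    descendant_iterator() {}  // empty stack: this is end()
    explicit descendant_iterator(const Item& root) { push_children(root); }

    reference operator*() const { return *m_stack.back(); }
    pointer operator->() const { return m_stack.back(); }

    descendant_iterator& operator++() {
        const Item* current = m_stack.back();
        m_stack.pop_back();
        push_children(*current);  // descend before moving on to siblings
        return *this;
    }
    descendant_iterator operator++(int) {
        descendant_iterator old(*this);
        ++*this;
        return old;
    }

    friend bool operator==(const descendant_iterator& a,
                           const descendant_iterator& b) {
        return a.m_stack == b.m_stack;
    }
    friend bool operator!=(const descendant_iterator& a,
                           const descendant_iterator& b) {
        return !(a == b);
    }

private:
    void push_children(const Item& item) {
        // Push in reverse so that the first child ends up on top.
        for (std::size_t i = item.children.size(); i > 0; --i)
            m_stack.push_back(&item.children[i - 1]);
    }
    std::vector<const Item*> m_stack;
};

inline bool is_even(const Item& item) { return item.data % 2 == 0; }

// Standard algorithms now work on the tree without copying any Item.
inline std::ptrdiff_t count_even_descendants(const Item& root) {
    return std::count_if(descendant_iterator(root), descendant_iterator(),
                         is_even);
}
```

A shallow traversal needs no custom iterator at all (children.begin() and children.end() already do the job), and a parent traversal would need a parent pointer that the Item struct in the question does not have.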
I need a map whose keys are of some composite type T to a vector of iterators of the map itself.
(e.g., think of a graph in which every node holds iterators that point to its parents.)
The reason I'm avoiding storing the parents' keys as values is that the keys are expensive objects, so I'd like to store iterators instead.
Is such a thing possible?
If not, what's the best alternative?
(I know I can use polymorphism and type erasure as a brute-force way to solve this, but I'm wondering if there's a better way.)
Due to 23.2.1 General container requirements:
Containers are objects that store other objects. They control allocation
and deallocation of these objects through constructors, destructors,
insert and erase operations.
Hence it is possible:
struct Key;
struct Value;
typedef std::map<Key, Value> Map;
struct Key {};
struct Value {
std::vector<Map::iterator> parents;
};
(All types and sizes are known)
You most likely want to store a smart pointer to the parent rather than an iterator.
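One possible shape of the pointer variant, as a sketch only: it swaps the smart pointer for a plain non-owning pointer to the map entry, which is enough when the map owns everything. std::string stands in for the expensive key, and the names Node, Entry, Graph and add_edge are invented here. It relies on std::map never moving its elements, so pointers to entries stay valid until that entry is erased.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Node;  // defined below

// The map's value_type; only a pointer to it is needed before Node is complete.
typedef std::pair<const std::string, Node> Entry;

struct Node {
    std::vector<Entry*> parents;  // non-owning; the key is not copied
};

typedef std::map<std::string, Node> Graph;

// Records `parent` as a parent of `child`, creating either entry if missing.
inline void add_edge(Graph& g, const std::string& parent,
                     const std::string& child) {
    Entry& p = *g.insert(Entry(parent, Node())).first;
    g[child].parents.push_back(&p);
}
```

Each stored pointer gives access to both the parent's key (first) and its value (second), so nothing is lost compared with storing an iterator.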
I have two questions:
How can the Array Abstract data type be modified to implement an Associative Array?
How can the tree abstract data type be modified to implement an Associative Array?
To create an associative array out of an array, you'd typically start with an array of some sort of structure:
struct item {
key_type key;
value_type value;
};
Then you'd use keys to look up values. For the sake of efficiency, you'd typically want to sort the array based on the keys, so you could use a binary search (or an interpolating search, if there's any degree of predictability to your key distribution).
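As a rough illustration of the sorted-array variant, assuming int keys and std::string values (the question does not fix either type) and using the made-up names array_put and array_get:

```cpp
#include <algorithm>
#include <string>
#include <vector>

typedef int key_type;            // assumed for the example
typedef std::string value_type;  // assumed for the example

struct item {
    key_type key;
    value_type value;
};

inline bool key_less(const item& lhs, const key_type& key) {
    return lhs.key < key;
}

// Keeps the array sorted by key; an existing key gets its value overwritten.
inline void array_put(std::vector<item>& table, const key_type& key,
                      const value_type& value) {
    std::vector<item>::iterator pos =
        std::lower_bound(table.begin(), table.end(), key, key_less);
    if (pos != table.end() && pos->key == key) {
        pos->value = value;
        return;
    }
    item entry = {key, value};
    table.insert(pos, entry);  // shifts the tail: linear in the worst case
}

// Binary search; returns a null pointer when the key is absent.
inline const value_type* array_get(const std::vector<item>& table,
                                   const key_type& key) {
    std::vector<item>::const_iterator pos =
        std::lower_bound(table.begin(), table.end(), key, key_less);
    return (pos != table.end() && pos->key == key) ? &pos->value : nullptr;
}
```

Lookups are logarithmic, but every insert may shift elements, which is the usual trade-off against a tree.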
For a tree, you'd do pretty much the same, except that for a tree a binary search is the default. You end up with a node pretty similar to that for an array, plus a couple of pointers:
struct node {
key_type key;
value_type value;
struct node *left;
struct node *right;
};
Depending on the type of tree involved, you might also want another pointer to create a threaded tree and/or some balance information (e.g., for an AVL or R-B tree). Conversely, for a B-Tree you'd end up with arrays of nodes about like for the associative array, and link those together into a balanced tree.
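For comparison, a bare-bones sketch of the tree variant with the same assumed key and value types. It is an unbalanced binary search tree, so it leaves out the balance information mentioned above; tree_put, tree_get and tree_destroy are names invented for the example.

```cpp
#include <string>

typedef int key_type;            // assumed for the example
typedef std::string value_type;  // assumed for the example

struct node {
    key_type key;
    value_type value;
    node* left;
    node* right;
};

// Inserts or overwrites; returns the root of the (possibly new) subtree.
inline node* tree_put(node* root, const key_type& key,
                      const value_type& value) {
    if (!root) return new node{key, value, nullptr, nullptr};
    if (key < root->key)
        root->left = tree_put(root->left, key, value);
    else if (root->key < key)
        root->right = tree_put(root->right, key, value);
    else
        root->value = value;
    return root;
}

// Walks down from the root; returns a null pointer when the key is absent.
inline const value_type* tree_get(const node* root, const key_type& key) {
    while (root) {
        if (key < root->key)
            root = root->left;
        else if (root->key < key)
            root = root->right;
        else
            return &root->value;
    }
    return nullptr;
}

// Frees every node of the tree.
inline void tree_destroy(node* root) {
    if (!root) return;
    tree_destroy(root->left);
    tree_destroy(root->right);
    delete root;
}
```

Without rebalancing, inserting keys in sorted order degrades this to a linked list, which is why real implementations use AVL or red-black trees.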
Good day!
In his "Effective STL" Scott Meyers wrote
A third is to use the information in an ordered container of iterators to iteratively splice the list's elements into the positions you'd like them to be in. As you can see, there are lots of options. (Item 31, second part)
Can someone explain this approach to me?
More text (to understand the context):
The algorithms sort, stable_sort, partial_sort, and nth_element require random access iterators, so they may be applied only to vectors, strings, deques, and arrays. It makes no sense to sort elements in standard associative containers, because such containers use their comparison functions to remain sorted at all times. The only container where we might like to use sort, stable_sort, partial_sort, or nth_element, but can't, is list, and list compensates somewhat by offering its sort member function. (Interestingly, list::sort performs a stable sort.) If you want to sort a list, then, you can, but if you want to use partial_sort, or nth_element on the objects in a list, you have to do it indirectly. One indirect approach is to copy the elements into a container with random access iterators, then apply the desired algorithm to that. Another is to create a container of list::iterators, use the algorithm on that container, then access the list elements via the iterators. A third is to use the information in an ordered container of iterators to iteratively splice the list's elements into the positions you'd like them to be in. As you can see, there are lots of options.
I'm not sure what the confusion is, but I suspect it is about what "splicing" refers to: std::list<T> has a splice() member function (well, actually several overloads) which transfers nodes between lists. That is, you create a std::vector<std::list<T>::const_iterator> and apply the sorting algorithm (e.g. std::partial_sort()) to it, using a comparison that dereferences the iterators. Then you create a new std::list<T> and use the splice() member with the iterators from the sorted vector to put the nodes into their correct order without moving the objects themselves.
This would look something like this:
std::vector<std::list<T>::const_iterator> tmp;
for (auto it(list.begin()), end(list.end()); it != end; ++it) {
tmp.push_back(it);
}
some_sort_of(tmp);
std::list<T> result;
for (auto it(tmp.begin()), end(tmp.end()); it != end; ++it) {
result.splice(result.end(), list, *it); // *it is the stored list iterator
}
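Filled in for a std::list<int> and std::partial_sort, the outline above could look roughly like this. The function name partial_sort_list is made up, and the lambda assumes C++11.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <vector>

// Moves the n smallest elements of `values` to the front in ascending order.
// The remaining elements keep their original relative order.
inline void partial_sort_list(std::list<int>& values, std::size_t n) {
    typedef std::list<int>::iterator list_it;

    // 1. Collect one iterator per node; vectors offer random access.
    std::vector<list_it> its;
    for (list_it it = values.begin(); it != values.end(); ++it)
        its.push_back(it);
    if (n > its.size()) n = its.size();

    // 2. Partially sort the iterators by the values they point to.
    std::partial_sort(its.begin(), its.begin() + n, its.end(),
                      [](list_it a, list_it b) { return *a < *b; });

    // 3. Relink the chosen nodes in sorted order; no element is copied.
    std::list<int> front;
    for (std::size_t i = 0; i < n; ++i)
        front.splice(front.end(), values, its[i]);
    values.splice(values.begin(), front);
}
```

splice() only relinks nodes, so the iterators stored in the vector stay valid throughout, which is what makes the approach work.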
Let's say you wanted to do a partial_sort on a list. You could store the iterators to the list in a set, by providing a comparison function that can sort using the iterators, like this:
struct iterator_less
{
bool operator() (std::list<int>::iterator lhs,
std::list<int>::iterator rhs) const
{
return (*lhs < *rhs);
}
};
typedef std::multiset<
std::list<int>::iterator, iterator_less
> iterator_set;
Then you could let the set perform the sort, and since it contains iterators into the list, you could use list::splice to splice them into a partially sorted list:
std::list<int> unsorted, partialSorted;
unsorted.push_back(11);
unsorted.push_back(2);
unsorted.push_back(2);
unsorted.push_back(99);
unsorted.push_back(2);
unsorted.push_back(4);
unsorted.push_back(5);
unsorted.push_back(7);
unsorted.push_back(34);
// First copy the iterators into the set
iterator_set itSet;
for( auto it = unsorted.begin(); it!=unsorted.end();++it)
{
itSet.insert(it);
}
// now if you want a partial_sort with the first 3 elements, iterate through the
// set grabbing the first item in the set and then removing it.
int count = 3;
while(count--)
{
iterator_set::iterator setTop = itSet.begin();
partialSorted.splice(
partialSorted.end(), // append, so the smallest element stays first
unsorted,
*setTop);
itSet.erase(setTop);
}
partialSorted.splice(
partialSorted.end(),
unsorted,
unsorted.begin(),
unsorted.end());
An ordered container would be either std::set or std::map. If you're willing to make a comparator that takes iterators you would use std::set<std::list<mydata>::iterator,comparator>, otherwise you could use std::map<mydata,std::list<mydata>::iterator>. You go through your list from begin() to end() and insert the iterators into the set or map; now you can use it to access the items in the list in sorted order by iterating the set or map, because it's automatically sorted.
Ordered containers are, for example, std::set and std::multiset. std::set is typically implemented as a balanced BST. So what it says is that you should create a std::set of std::list iterators (with a comparator that dereferences them) and then use the inherent BST structure to do the sorting. Here is a link on BST to get you started.
Edit Ah. Just noticed "ordered container of iterators". That would imply creating an index into another container.
Boost Multi Index has many examples of such things, where a single collection is indexed by several different ordering predicates and the indices are nothing more than collections of 'pointers' (usually iterators) into the base container.
"A third is to use the information in an ordered container of iterators to iteratively splice the list's elements into the positions you'd like them to be in"
One thing I think would match that description is running std::sort_heap on a vector that has had std::make_heap/push_heap/pop_heap operating on it. (The heap algorithms need random-access iterators, so they cannot be applied to a std::list directly.)
make_heap : convert a sequence to a heap
sort_heap : sort a heap
push_heap : insert an element in a heap
pop_heap : remove the top element from a heap
Heaps are organizations of elements within sequences, which make it (relatively) efficient to keep the collection in a known ordering under insert/removal. The order is implicit (like a recursive defined binary tree stored in a contiguous array) and can be transformed into the corresponding properly sorted sequence by doing the (highly efficient) sort_heap algorithm on it.
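A small sketch of those four algorithms on a std::vector<int>, with the default comparison (which gives a max-heap). The wrapper functions heap_then_sort and pop_max are made up for the example.

```cpp
#include <algorithm>
#include <vector>

// Turns `values` into a max-heap, adds one element, then sorts ascending.
inline std::vector<int> heap_then_sort(std::vector<int> values, int extra) {
    std::make_heap(values.begin(), values.end());  // arrange as a heap
    values.push_back(extra);                       // new element at the back...
    std::push_heap(values.begin(), values.end());  // ...sifted into place
    std::sort_heap(values.begin(), values.end());  // heap -> sorted sequence
    return values;
}

// Removes and returns the largest element of a non-empty heap.
inline int pop_max(std::vector<int>& heap) {
    std::pop_heap(heap.begin(), heap.end());  // moves the maximum to the back
    int top = heap.back();
    heap.pop_back();
    return top;
}
```

After sort_heap the range is sorted but no longer a heap, so it has to be rebuilt with make_heap before further push_heap or pop_heap calls.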
I have a structure
struct dbdetails
{
int id;
string val;
};
I need a data structure in C++ that can hold structure variable with a sort capability. Is it possible? I was looking at vector, which can hold structure variable, but I will not be able to sort it based on id, because it is a structure member. Any suggestions?
You need a custom functor for comparing your try_ objects. This should do the trick:
#include <algorithm>
#include <vector>
// try is a keyword, so the struct is renamed to try_
struct sorthelper
{
bool operator()(const try_& left, const try_& right) const
{ return left.id < right.id; }
};
...
std::vector<try_> v;
// fill vector
std::sort(v.begin(), v.end(), sorthelper());
...
Please feel free to ask if you have any follow-up questions. Do you possess the Stroustrup book?
Edit: Suggestion of Matteo:
struct try_
{
int id;
string val;
bool operator<(const try_& other) const
{return id < other.id;}
}; // no s here plz.
...
std::vector<try_> v;
// fill vector
std::sort(v.begin(), v.end());
...
You could use a std::map. They are sorted by key, so you could do:
std::map<int, std::string> myStuff;
This is a map with an int as key and std::string as value. When you iterate over the map, you’ll find that it’s automatically sorted by the key.
Note you would no longer need your struct with this solution. If you absolutely need the data in a struct (perhaps to interface with some external library) you could always copy data from the map into a struct as needed.
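If the struct is still needed at some boundary, the copy could be as simple as the following sketch. The function name to_structs is made up; dbdetails is the struct from the question.

```cpp
#include <map>
#include <string>
#include <vector>

struct dbdetails {
    int id;
    std::string val;
};

// Copies the map's entries back into structs. Iterating a std::map visits
// keys in ascending order, so the result is already sorted by id.
inline std::vector<dbdetails> to_structs(
        const std::map<int, std::string>& myStuff) {
    std::vector<dbdetails> out;
    for (std::map<int, std::string>::const_iterator it = myStuff.begin();
         it != myStuff.end(); ++it) {
        dbdetails d = {it->first, it->second};
        out.push_back(d);
    }
    return out;
}
```

Keep in mind that a std::map holds one value per key, so two records with the same id cannot both be stored; std::multimap lifts that restriction.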
You can have a vector of structs and then sort them with a comparison function, which has to be declared before the call to std::sort:
bool vectStructSort(Try const& lhs, Try const& rhs) { // try is a keyword, hence Try.
return lhs.id < rhs.id;
}
std::sort(vectStruct.begin(), vectStruct.end(), &vectStructSort);
It depends on what requirements you have for your data container.
You may find a set useful (in the STL, set is a Sorted Associative Container that stores objects of type Key). Or even a hash set, or a sorted array.
If you know that you need your elements to be sorted, it is maybe better to use a sorted container, instead of sorting it each time you need to.
All ordered containers (std::set, std::map, std::multiset, std::multimap) are, well, ordered. Non-ordered containers (std::list, std::vector, std::deque) can be ordered by providing a comparison function and using std::sort (vector, deque) or by providing that comparator to a member function (list).
It all boils down to what you actually need. If you need to keep the elements sorted at all times, then a sorted container might be more efficient than modifying the container and re-sorting. On the other hand, if having the container sorted at all times is not a requirement, but being able to modify the elements is, then you might prefer a vector. Sorted containers maintain the keys as constant objects, as modification of the keys would break the sort invariant.
In some cases the container needs to be sorted at all times, but it does not change after some initialization phases. In that case a non-sorted container that gets sorted after initialization can be fine.
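For the "sorted at all times" case, a std::set with a comparator on the id member might look like this sketch. The comparator by_id, the typedef dbdetails_set and the helper add are names made up for the example.

```cpp
#include <set>
#include <string>

struct dbdetails {
    int id;
    std::string val;
};

// Orders elements by id only; two elements with equal ids count as duplicates.
struct by_id {
    bool operator()(const dbdetails& a, const dbdetails& b) const {
        return a.id < b.id;
    }
};

typedef std::set<dbdetails, by_id> dbdetails_set;

// Returns false when an element with the same id is already present.
inline bool add(dbdetails_set& rows, int id, const std::string& val) {
    dbdetails d = {id, val};
    return rows.insert(d).second;
}
```

Elements of a set are const, so changing a stored val means erasing the element and inserting a modified copy.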
You can sort a vector based on struct members. You just need a custom comparator.
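With C++11 or later the comparator can be a lambda, as in this short sketch (sort_by_id is a made-up helper name; dbdetails is the struct from the question):

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct dbdetails {
    int id;
    std::string val;
};

// Sorts the rows in ascending order of their id member.
inline void sort_by_id(std::vector<dbdetails>& rows) {
    std::sort(rows.begin(), rows.end(),
              [](const dbdetails& lhs, const dbdetails& rhs) {
                  return lhs.id < rhs.id;
              });
}
```

std::sort does not preserve the relative order of rows with equal ids; std::stable_sort takes the same arguments if that matters.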